News, events, and videos from the project. The nxtAIM research enables an unlimited learning horizon through generatively produced data, supplemented by real-time verification of perception.
News
Project videos
What does the video show?
You can see a sequence that was created using the CARLA simulation tool. In CARLA, street scenes can be simulated from the ego perspective of a vehicle. This has the advantage that both the environment and the course of the scene can be controlled. In addition, annotations such as semantic segmentation masks are obtained automatically for each frame of the sequence. In nxtAIM, simulated road scenes are to be photorealistically enhanced using conditional Generative Adversarial Networks (GANs) whose discriminator and generator are deep convolutional networks (cDCGANs). The team conditions on the synthetic segmentation mask and uses the specific cDCGAN pix2pixHD because its discriminator and generator architecture allows it to generate high-resolution images. The video on the left shows the semantic segmentation mask, in which each pixel of the image is classified. This segmentation mask is fed into the pix2pixHD generator, resulting in the photorealistically enhanced image on the right. To achieve such results, pix2pixHD was trained on the publicly available real-world dataset A2D2.
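The following is a minimal PyTorch sketch of the underlying idea, a generator conditioned on a one-hot encoded segmentation mask that outputs an RGB image. It is an illustration only, not the actual pix2pixHD code; all class and variable names are hypothetical, and the real pix2pixHD uses a coarse-to-fine generator with multi-scale discriminators.

```python
import torch
import torch.nn as nn

NUM_CLASSES = 20  # hypothetical number of semantic classes

class MaskToImageGenerator(nn.Module):
    """Toy conditional generator: one-hot segmentation mask -> RGB image."""
    def __init__(self, num_classes: int = NUM_CLASSES):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(num_classes, 64, kernel_size=7, padding=3),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 3, kernel_size=7, padding=3),
            nn.Tanh(),  # image values scaled to [-1, 1]
        )

    def forward(self, one_hot_mask: torch.Tensor) -> torch.Tensor:
        return self.net(one_hot_mask)

# Usage: convert an integer label map (H x W) to one-hot and generate an image.
label_map = torch.randint(0, NUM_CLASSES, (1, 512, 1024))      # stand-in for a CARLA mask
one_hot = torch.nn.functional.one_hot(label_map, NUM_CLASSES)  # (1, H, W, C)
one_hot = one_hot.permute(0, 3, 1, 2).float()                  # (1, C, H, W)
fake_image = MaskToImageGenerator()(one_hot)                   # (1, 3, H, W)
```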
What is being researched here?
The research aims to adapt simulated images from the synthetic domain to the domain of the real world, using conditional GANs. The problem is that annotating images by hand is time-consuming and expensive. When using simulators such as CARLA, by contrast, the desired annotations are obtained automatically for each image. For training semantic segmentation networks, one would like to have as much suitably annotated data as possible, so an obvious idea is to use data from the simulation.
The major obstacle, however, is the gap between the synthetic and the real-world domain. A semantic segmentation network trained on synthetic data and then deployed in an autonomous vehicle in the real world will not perform well. The aim of the research is therefore to photorealistically refine the simulated images so that, on the one hand, a larger amount of data is available for training the networks. On the other hand, one can also control how scenarios unfold in the simulation: for example, one can simulate safety-critical scenes such as a pedestrian suddenly stepping onto the street. Such scenarios cannot be staged in the real world because the risk to everyone involved is far too high, so they are missing from the training data of real datasets. By photorealistically enhancing such scenes, they can be used to validate semantic segmentation networks that have been trained on real data, testing whether these networks also perform well in safety-critical situations.
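A hedged sketch of that validation step: since the simulator supplies a ground-truth mask for every enhanced frame, a segmentation network trained on real data can be scored on safety-critical scenes, for example with per-class intersection over union (IoU). The names and the random stand-in data below are illustrative only.

```python
import numpy as np

def per_class_iou(pred: np.ndarray, gt: np.ndarray, num_classes: int) -> np.ndarray:
    """Per-class IoU between predicted and ground-truth label maps (H x W).

    Classes absent from both prediction and ground truth get IoU = NaN.
    """
    ious = np.full(num_classes, np.nan)
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:
            ious[c] = inter / union
    return ious

# Hypothetical usage: the ground truth comes for free from the simulator.
gt = np.random.randint(0, 5, (256, 512))    # stand-in for a CARLA annotation
pred = np.random.randint(0, 5, (256, 512))  # stand-in for a network prediction
miou = np.nanmean(per_class_iou(pred, gt, num_classes=5))
print(f"mIoU on the enhanced safety-critical scene: {miou:.3f}")
```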
Classification within nxtAIM
nxtAIM will use generative learning methods, in particular GANs, to photorealistically enhance simulated street scenes. The generated images serve several purposes: training and validating semantic segmentation networks, and exploring new metrics that measure the difference between simulated and real images.
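One established metric for the distance between two image distributions, and a plausible starting point for the metrics mentioned above, is the Fréchet Inception Distance (FID). A minimal NumPy/SciPy sketch, assuming the feature vectors (e.g. Inception activations) have already been extracted; whether nxtAIM uses FID specifically is an assumption here.

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(feats_real: np.ndarray, feats_sim: np.ndarray) -> float:
    """Frechet distance between two sets of image features (N x D each).

    FID = ||mu_r - mu_s||^2 + Tr(C_r + C_s - 2 * sqrt(C_r @ C_s))
    """
    mu_r, mu_s = feats_real.mean(axis=0), feats_sim.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_s = np.cov(feats_sim, rowvar=False)
    cov_mean = sqrtm(cov_r @ cov_s)
    if np.iscomplexobj(cov_mean):  # discard numerical artifacts of sqrtm
        cov_mean = cov_mean.real
    diff = mu_r - mu_s
    return float(diff @ diff + np.trace(cov_r + cov_s - 2.0 * cov_mean))

# Illustrative call with random stand-in features:
score = frechet_distance(np.random.randn(200, 64), np.random.randn(200, 64))
```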
What does the video show?
The video shows a semantic segmentation of image sequences, meaning that each pixel in the image is assigned to a specific class. In the upper part of the video, these segments are tracked over time, with neighboring pixels of the same class being grouped together. Each color represents a unique segment ID.
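The grouping described above corresponds to connected-component labeling within each class, with segments then matched from frame to frame, for example by overlap. The SciPy sketch below illustrates this idea; it is not the project's actual tracking code, and the greedy IoU matching is an assumed, simplified strategy.

```python
import numpy as np
from scipy import ndimage

def class_segments(label_map: np.ndarray, cls: int) -> np.ndarray:
    """Connected components of all pixels belonging to one class."""
    components, _ = ndimage.label(label_map == cls)
    return components  # 0 = background, 1..n = segment IDs

def match_segments(prev: np.ndarray, curr: np.ndarray, min_iou: float = 0.3):
    """Greedy frame-to-frame matching of segment IDs by IoU overlap."""
    matches = {}
    for pid in np.unique(prev[prev > 0]):
        best_iou, best_cid = 0.0, None
        for cid in np.unique(curr[curr > 0]):
            inter = np.logical_and(prev == pid, curr == cid).sum()
            union = np.logical_or(prev == pid, curr == cid).sum()
            iou = inter / union if union else 0.0
            if iou > best_iou:
                best_iou, best_cid = iou, cid
        if best_iou >= min_iou:
            matches[pid] = best_cid  # segment pid continues as segment cid
    return matches
```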
What is being researched here?
Here, the behavior of road users over time is examined, and the corresponding prediction capability of neural networks is researched. This makes it possible to determine how well such networks predict future events or trends from the available data.
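A common way to make "how well" measurable is to compare a learned predictor against a simple baseline. The constant-velocity model below is a hedged illustration of such a baseline, not a method shown in the video.

```python
import numpy as np

def constant_velocity_forecast(track: np.ndarray, horizon: int) -> np.ndarray:
    """Extrapolate a 2D trajectory (T x 2) `horizon` steps into the future,
    assuming the last observed velocity stays constant."""
    velocity = track[-1] - track[-2]
    steps = np.arange(1, horizon + 1)[:, None]
    return track[-1] + steps * velocity

# Example: a road user moving diagonally, predicted 3 steps ahead.
observed = np.array([[0.0, 0.0], [1.0, 0.5], [2.0, 1.0]])
future = constant_velocity_forecast(observed, horizon=3)
# A neural network's prediction error can then be reported relative to
# this baseline, e.g. as average displacement error (ADE).
```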
Classification within nxtAIM
The video shows street scenes. In nxtAIM, time series of trajectories, i.e. the temporal progression of movement paths, are examined and analyzed in such scenes.
What does the video show?
The video showcases the new simulator SLEDGE, which creates driving environments using generative models.
In the first section, three simulation modes are compared:
- Log replay: The driving environment is taken from a dataset.
- Lane > Agent: The road layout is from a dataset, and active traffic participants such as cars or pedestrians are generated by a generative model.
- Lane & Agent: The entire driving environment, including road layout and traffic participants, is created by a generative model.
The second section shows how the generative model can be used to run long simulations, which was not possible before. The third section shows that these long simulations can be used to find the failure modes of autonomous motion planners, as sketched after this paragraph.
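A hedged sketch of how such a search for failure modes might look: generate a long scenario, roll out the planner, and record the steps at which a failure criterion (here, a collision) fires. All interfaces below are hypothetical and do not reflect the actual SLEDGE API.

```python
# Hypothetical interfaces; SLEDGE's real API may differ.
from dataclasses import dataclass, field

@dataclass
class FailureLog:
    collisions: list = field(default_factory=list)

def run_long_simulation(simulator, planner, num_steps: int = 100_000) -> FailureLog:
    """Roll out a planner in a generated environment and log failure modes."""
    log = FailureLog()
    state = simulator.reset(mode="lane_and_agent")  # fully generated scene (assumed flag)
    for step in range(num_steps):
        action = planner.plan(state)
        state = simulator.step(action)
        if simulator.in_collision(state):           # hypothetical failure check
            log.collisions.append(step)
    return log
```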
What is being researched here?
Architectures of generative models and simulators for geometric environment models in autonomous driving are investigated.
Classification within nxtAIM
As part of the nxtAIM project, the simulator and its generative models are used to test motion planners for autonomous driving more thoroughly.