Contact
Name | Peter Roch |
---|---|
Position | Researcher |
Phone | +49-201-183-6370 |
Fax | +49-201-183-4176 |
Email | peter.roch@uni-due.de |
Address | Schützenbahn 70 Building SA 45127 Essen |
Room | SA-118 |
Research Interest
Robotics
Computer Vision
Electric vehicle navigation
Education
Master of Science – Universität Duisburg-Essen, degree program: Software and Network Engineering, 2019
Bachelor of Science – Universität Duisburg-Essen, degree program: Angewandte Informatik – Systems Engineering, 2017
Publications
2024 |
Bijan Shahbaz Nejad, Peter Roch, Marcus Handte, Pedro José Marrón: A visual foreign object detection system for wireless charging of electric vehicles. In: Machine Vision and Applications, vol. 35, no. 4, pp. 69, 2024, ISSN: 1432-1769. (Type: Journal Article) Abstract: Wireless charging of electric vehicles can be achieved by installing a transmitter coil into the ground and a receiver coil at the underbody of a vehicle. In order to charge efficiently, accurate alignment of the charging components must be accomplished, which can be achieved with a camera-based positioning system. Due to an air gap between both charging components, foreign objects can interfere with the charging process and pose potential hazards to the environment. Various foreign object detection systems have been developed with the motivation to increase the safety of wireless charging. In this paper, we propose a foreign object detection technique that utilizes the integrated camera of an embedded positioning system. Due to operation in an outdoor environment, we cannot determine the types of objects that may occur in advance. Accordingly, our approach achieves object-type independence by learning the features of the charging surface, to then classify anomalous regions as foreign objects. To examine the capability of detecting foreign objects, we evaluate our approach by conducting experiments with images depicting known and unknown object types. For the experiments, we use an image dataset recorded by a positioning camera of an operating wireless charging station in an outdoor environment, which we published alongside our research. As a benchmark system, we employ YOLOv8 (Jocher et al. in Ultralytics YOLO, 2023), a state-of-the-art neural network that has been used in various contexts for foreign object detection. While we acknowledge the performance of YOLOv8 for known object types, our approach achieves up to 18% higher precision and 46% higher detection success for unknown objects. |
2023 |
Peter Roch, Bijan Shahbaz Nejad, Marcus Handte, Pedro José Marrón: Positionierung induktiv geladener Fahrzeuge. In: Proff, Heike, Clemens, Markus, Marrón, Pedro José, Schmülling, Benedikt (Ed.): Induktive Taxiladung für den öffentlichen Raum: Technische und betriebswirtschaftliche Aspekte, pp. 93–142, 2023, ISBN: 978-3-658-39979-5. (Type: Proceedings Article) Abstract: The goal of the TALAKO project is to enable wireless charging of electric vehicles in public spaces. Inductive charging requires precise alignment of the vehicle to ensure an efficient charging process, since the alignment directly affects efficiency. Positioning can be challenging for the driver, who cannot perceive the offset between the charging components without further assistance. Therefore, in addition to the inductive charging infrastructure itself, the developed installation includes a camera-based driver assistance system. The driver assistance system is used to detect approaching vehicles and to support the driver during positioning. It consists of two components: a camera-based positioning system and a driver guidance application. The positioning system uses camera images to compute the position of vehicles with an accuracy of 5 cm. From this, the distance between the vehicle and the charging plate is derived. The driver guidance application interprets the position information and generates suitable instructions for the driver. The positioning system is based on a neural network that detects the vehicle's tires. Since the distance between the tires is known, the position and rotation of the vehicle can be computed from it. Investigations have shown that the accuracy is in the range of 5 cm. To operate the positioning system independently of vehicle type and installation site, it must be configured accordingly. To this end, the neural network must be trained and the camera orientation calibrated. Training of the neural network is supplemented with synthetically generated images, which can be produced with a purpose-built image generator. The camera orientation is determined using a special pattern that is placed at various locations on the ground. Since the real dimensions of the pattern are known, the geometry of the installation site can be derived from it. A user study examined which display modality is best suited for the driver guidance application under the given circumstances. The study showed that users prefer a screen located inside the vehicle for the output of instructions. The driver guidance application was therefore implemented as a mobile application, which shows the driver the position of the vehicle relative to the charging station. Various visualizations were compared for displaying the spatial relations. With several visualizations, users are able to position the vehicle within a tolerance of 5 cm. Most users, however, prefer a bird's-eye view. Communication between the two components was implemented using Bluetooth Low Energy. In contrast to other wireless communication options such as Wi-Fi, this has the advantage that information can be sent to the mobile application without the delay of establishing a connection. As a result, the driver can start positioning without delay immediately after arriving at the installation. The overall system was put into operation as a prototype at a taxi company in Mülheim an der Ruhr (Auto Stephany GmbH (2012) Auto Stephany GmbH – Taxi Dienstleistungen. Retrieved on 04.08.2022 from https://taxi-stephany.de/) and iteratively optimized over several months. During this time, valuable experience was gained, which contributed to the continuous improvement of both the positioning system and the driver guidance application. After the optimizations were completed, the developed system was successfully deployed as part of the pilot installation in Cologne with multiple charging bays. Since the pilot installation in Cologne is operated in a public space, the personal rights of individuals must be respected. However, obtaining explicit consent to data processing from those affected is not practicable. Therefore, automated obfuscation was used, which removes personal data such as license plates and faces from the camera images in order to avoid processing them. |
Bijan Shahbaz Nejad, Peter Roch, Marcus Handte, Pedro José Marrón: Visual Foreign Object Detection for Wireless Charging of Electric Vehicles. In: Bebis, George, Ghiasi, Golnaz, Fang, Yi, Sharf, Andrei, Dong, Yue, Weaver, Chris, Leo, Zhicheng, LaViola Jr., Joseph J., Kohli, Luv (Ed.): Advances in Visual Computing, pp. 188–201, Springer Nature Switzerland, 2023, ISBN: 978-3-031-47966-3. (Type: Proceedings Article) Abstract: Wireless charging of electric vehicles can be achieved by installing a transmitter coil into the ground and a receiver coil at the underbody of a vehicle. In order to charge efficiently, accurate alignment of the charging components must be accomplished, which can be achieved with a camera-based positioning system. Due to an air gap between both charging components, foreign objects can interfere with the charging process and pose potential hazards to the environment. Various foreign object detection systems have been developed with the motivation to increase the safety of wireless charging. In this paper, we propose an object-type independent foreign object detection technique which utilizes the existing camera of an embedded positioning system. To evaluate our approach, we conduct two experiments by analyzing images from a dataset of a wireless charging surface and from a publicly available dataset depicting foreign objects in an airport environment. Our technique outperforms two background subtraction algorithms and reaches accuracy scores that are comparable to the accuracy achieved by a state-of-the-art neural network (~97%). While acknowledging the superior accuracy results of the neural network, we observe that our approach requires significantly less resources, which makes it more suitable for embedded devices. The dataset of the first experiment is published alongside this paper and consists of 3652 labeled images recorded by a positioning camera of an operating wireless charging station in an outdoor environment. |
Peter Roch, Bijan Shahbaz Nejad, Marcus Handte, Pedro José Marrón: Optimizing PnP-Algorithms for Limited Point Correspondences Using Spatial Constraints. In: Bebis, George, Ghiasi, Golnaz, Fang, Yi, Sharf, Andrei, Dong, Yue, Weaver, Chris, Leo, Zhicheng, LaViola Jr., Joseph J., Kohli, Luv (Ed.): Advances in Visual Computing, pp. 215–229, Springer Nature Switzerland, 2023, ISBN: 978-3-031-47966-3. (Type: Proceedings Article) Abstract: Pose Estimation is an important component of many real-world computer vision systems. Most existing pose estimation algorithms need a large number of point correspondences to accurately determine the pose of an object. Since the number of point correspondences depends on the object’s appearance, lighting and other external conditions, detecting many points may not be feasible. In many real-world applications, movement of objects is limited due to gravity. Hence, detecting objects with only three degrees of freedom is usually sufficient. This allows us to improve the accuracy of pose estimation by changing the underlying equation of the perspective-n-point problem to allow only three variables instead of six. By using the improved equations, our algorithm is more robust against detection errors with limited point correspondences. In this paper, we specify two scenarios where such constraints apply. The first one is about parking a vehicle on a specific spot, while the second scenario describes a camera observing objects from a bird’s-eye view. In both scenarios, objects can only move in the ground plane and rotate around the vertical axis. Experiments with synthetic data and real-world photographs have shown that our algorithm outperforms state-of-the-art pose estimation algorithms. Depending on the scenario, our algorithm usually achieves 50% better accuracy, while being equally fast. |
2022 |
Bijan Shahbaz Nejad, Peter Roch, Marcus Handte, Pedro José Marrón: Enhancing Privacy in Computer Vision Applications: An Emotion Preserving Approach to Obfuscate Faces. In: Bebis, George, Li, Bo, Yao, Angela, Liu, Yang, Duan, Ye, Lau, Manfred, Khadka, Rajiv, Crisan, Ana, Chang, Remco (Ed.): Advances in Visual Computing, pp. 80–90, Springer Nature Switzerland, 2022, ISBN: 978-3-031-20716-7. (Type: Proceedings Article) Abstract: Computer vision offers many techniques to facilitate the extraction of semantic information from images. If the images include persons, preservation of privacy in computer vision applications is challenging, but undoubtedly desired. A common technique to prevent exposure of identities is to cover peoples' faces with, for example, a black bar. Although emotions are crucial for reasoning in many applications, facial expressions may be covered, which hinders the recognition of actual emotions. Thus, recorded images containing obfuscated faces may be useless for further analysis and investigation. We introduce an approach that enables automatic detection and obfuscation of faces. To avoid privacy conflicts, we use synthetically generated faces for obfuscation. Furthermore, we reconstruct the facial expressions of the original face, adjust the color of the new face and seamlessly clone it to the original location. To evaluate our approach experimentally, we obfuscate faces from various datasets by applying blurring, pixelation and the proposed technique. To determine the success of obfuscation, we verify whether the original and the resulting face represent the same person using a state-of-the-art matching tool. Our approach successfully obfuscates faces in more than 97% of the cases. This performance is comparable to blurring, which scores around 96%, and even better than pixelation (76%). Moreover, we analyze how effectively emotions can be preserved when obfuscating the faces. For this, we utilize emotion recognizers to recognize the depicted emotions before and after obfuscation. Regardless of the recognizer, our approach preserves emotions more effectively than the other techniques while preserving a convincingly natural appearance. |
Peter Roch, Bijan Shahbaz Nejad, Marcus Handte, Pedro José Marrón: GUILD - A Generator for Usable Images in Large-Scale Datasets. In: Bebis, George, Li, Bo, Yao, Angela, Liu, Yang, Duan, Ye, Lau, Manfred, Khadka, Rajiv, Crisan, Ana, Chang, Remco (Ed.): Advances in Visual Computing, pp. 245–258, Springer Nature Switzerland, 2022, ISBN: 978-3-031-20716-7. (Type: Proceedings Article) Abstract: Large image datasets are important for many different aspects of computer vision. However, creating datasets containing thousands or millions of labeled images is time consuming. Instead of manual collection of a large dataset, we propose a framework for generating large-scale datasets synthetically. Our framework is capable of generating realistic looking images with varying environmental conditions, while automatically creating labels. To evaluate usefulness of such a dataset, we generate two datasets containing vehicle images. Afterwards, we use these images to train a neural network. We then compare detection accuracy to the same neural network trained with images of existing datasets. The experiments show that our generated datasets are well-suited to train neural networks and achieve comparable accuracy to existing datasets containing real photographs, while they are much faster to create. |
2021 |
Alexander Julian Golkowski, Marcus Handte, Peter Roch, Pedro José Marrón: An Experimental Analysis of the Effects of Different Hardware Setups on Stereo Camera Systems. In: International Journal of Semantic Computing, vol. 15, no. 3, pp. 337–357, 2021, ISSN: 1793-7108. (Type: Journal Article) Abstract: For many application areas such as autonomous navigation, the ability to accurately perceive the environment is essential. For this purpose, a wide variety of well-researched sensor systems are available that can be used to detect obstacles or navigation targets. Stereo cameras have emerged as a very versatile sensing technology in this regard due to their low hardware cost and high fidelity. Consequently, much work has been done to integrate them into mobile robots. However, the existing literature focuses on presenting the concepts and algorithms used to implement the desired robot functions on top of a given camera setup. As a result, the rationale and impact of choosing this camera setup are usually neither discussed nor described. Thus, when designing the stereo camera system for a mobile robot, there is not much general guidance beyond isolated setups that worked for a specific robot. To close the gap, this paper studies the impact of the physical setup of a stereo camera system in indoor environments. To do this, we present the results of an experimental analysis in which we use a given software setup to estimate the distance to an object while systematically changing the camera setup. Thereby, we vary the three main parameters of the physical camera setup, namely the angle and distance between the cameras as well as the field of view and a rather soft parameter, the resolution. Based on the results, we derive several guidelines on how to choose the parameters for an application. |
Peter Roch, Bijan Shahbaz Nejad, Marcus Handte, Pedro José Marrón: Car Pose Estimation through Wheel Detection. In: Bebis, George, Athitsos, Vassilis, Yan, Tong, Lau, Manfred, Li, Frederick, Shi, Conglei, Yuan, Xiaoru, Mousas, Christos, Bruder, Gerd (Ed.): Advances in Visual Computing, pp. 265–277, Springer International Publishing, 2021, ISBN: 978-3-030-90439-5. (Type: Proceedings Article) Abstract: Car pose estimation is an essential part of different applications, including traffic surveillance, Augmented Reality (AR) guides or inductive charging assistance systems. For many systems, the accuracy of the determined pose is important. When displaying AR guides, a small estimation error can result in a different visualization, which will be directly visible to the user. Inductive charging assistance systems have to guide the driver as precise as possible, as small deviations in the alignment of the charging coils can decrease charging efficiency significantly. For accurate pose estimation, matches between image coordinates and 3d real-world points have to be determined. Since wheels are a common feature of cars, we use the wheelbase and rim radius to compute those real-world points. The matching image coordinates are obtained by three different approaches, namely the circular Hough-Transform, ellipse-detection and a neural network. To evaluate the presented algorithms, we perform different experiments: First, we compare their accuracy and time performance regarding wheel-detection in a subset of the images of The Comprehensive Cars (CompCars) dataset. Second, we capture images of a car at known positions, and run the algorithms on these images to estimate the pose of the car. Our experiments show that the neural network based approach is the best in terms of accuracy and speed. However, if training of a neural network is not feasible, both other approaches are accurate alternatives. |
Bijan Shahbaz Nejad, Peter Roch, Marcus Handte, Pedro José Marrón: Evaluating User Interfaces for a Driver Guidance System to Support Stationary Wireless Charging of Electric Vehicles. In: Bebis, George, Athitsos, Vassilis, Yan, Tong, Lau, Manfred, Li, Frederick, Shi, Conglei, Yuan, Xiaoru, Mousas, Christos, Bruder, Gerd (Ed.): Advances in Visual Computing, pp. 183–196, Springer International Publishing, 2021, ISBN: 978-3-030-90439-5. (Type: Proceedings Article) |
2020 |
Peter Roch, Bijan Shahbaz Nejad, Marcus Handte, Pedro José Marrón: Systematic Optimization of Image Processing Pipelines Using GPUs. In: Bebis, George, Yin, Zhaozheng, Kim, Edward, Bender, Jan, Subr, Kartic, Kwon, Bum Chul, Zhao, Jian, Kalkofen, Denis, Baciu, George (Ed.): Advances in Visual Computing, pp. 633–646, Springer International Publishing, Cham, 2020, ISBN: 978-3-030-64559-5. (Type: Proceedings Article) Abstract: Real-time computer vision systems require fast and efficient image processing pipelines. Experiments have shown that GPUs are highly suited for image processing operations, since many tasks can be processed in parallel. However, calling GPU-accelerated functions requires uploading the input parameters to the GPU's memory, calling the function itself, and downloading the result afterwards. In addition, since not all functions benefit from an increase in parallelism, many pipelines cannot be implemented exclusively using GPU functions. As a result, the optimization of pipelines requires a careful analysis of the achievable function speedup and the cost of copying data. In this paper, we first define a mathematical model to estimate the performance of an image processing pipeline. Thereafter, we present a number of micro-benchmarks gathered using OpenCV which we use to validate the model and which quantify the cost and benefits for different classes of functions. Our experiments show that comparing the function speedup without considering the time for copying can overestimate the achievable performance gain of GPU acceleration by a factor of two. Finally, we present a tool that analyzes the possible combinations of CPU and GPU function implementations for a given pipeline and computes the most efficient composition. By using the tool on their target hardware, developers can easily apply our model to optimize their application performance systematically. |
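The trade-off described in the abstract of "Systematic Optimization of Image Processing Pipelines Using GPUs" (function speedup versus the cost of copying data) can be illustrated with a small sketch. This is not the model or the tool from the paper: the `Stage` type, the `best_composition` function, the single fixed upload and download cost, and all timings are hypothetical assumptions chosen only to show how a cheapest CPU/GPU assignment might be computed for a linear pipeline whose input and output both reside in CPU memory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One pipeline step with hypothetical measured runtimes in milliseconds."""
    name: str
    cpu_ms: float
    gpu_ms: float


def best_composition(stages, upload_ms, download_ms):
    """Return (total_ms, devices) for the cheapest CPU/GPU assignment.

    Assumption: data starts and must end in CPU memory, and every
    CPU->GPU or GPU->CPU transfer has a fixed cost.
    """
    # cost[d]: cheapest time so far with the data currently on device d
    cost = {"cpu": 0.0, "gpu": upload_ms}
    plan = {"cpu": [], "gpu": []}
    for stage in stages:
        new_cost, new_plan = {}, {}
        for dev, run_ms in (("cpu", stage.cpu_ms), ("gpu", stage.gpu_ms)):
            other = "gpu" if dev == "cpu" else "cpu"
            copy_ms = download_ms if dev == "cpu" else upload_ms
            stay = cost[dev]               # data is already on this device
            move = cost[other] + copy_ms   # data must be copied over first
            if stay <= move:
                prev_cost, prev_plan = stay, plan[dev]
            else:
                prev_cost, prev_plan = move, plan[other]
            new_cost[dev] = prev_cost + run_ms
            new_plan[dev] = prev_plan + [dev]
        cost, plan = new_cost, new_plan
    # The final result has to be available on the CPU.
    end_on_gpu = cost["gpu"] + download_ms
    if cost["cpu"] <= end_on_gpu:
        return cost["cpu"], plan["cpu"]
    return end_on_gpu, plan["gpu"]


if __name__ == "__main__":
    pipeline = [
        Stage("blur", cpu_ms=10.0, gpu_ms=2.0),
        Stage("threshold", cpu_ms=1.0, gpu_ms=0.5),
        Stage("contours", cpu_ms=3.0, gpu_ms=8.0),
    ]
    total, devices = best_composition(pipeline, upload_ms=3.0, download_ms=3.0)
    print(total, devices)  # 11.5 ['gpu', 'gpu', 'cpu']
```

With these made-up numbers, the blur stage alone is five times faster on the GPU, yet the whole pipeline only improves from 14 ms (all CPU) to 11.5 ms once the copies are counted. If the transfer cost is raised to 10 ms, the all-CPU composition becomes the cheapest.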