Visual Perception and Spatial Computing (12 hours)
Teacher: Luigi Freda (PhD, Robotics & Computer Vision Engineer/Researcher)
Today, commodity cameras are used extensively in robots, cars, smartphones, AR/VR headsets, wearable devices,
and many other systems. These lightweight, low-cost sensors provide rich information that makes it possible to build a
3D model of the surrounding environment and understand its structure. This course introduces perception methods and tools for building such 3D models and extracting their semantic structure, a capability that crucially enables AI systems to interact with a scene intelligently and over the long term.
To this end, we provide an introduction to visual SLAM and related real-time techniques, focusing on how to
* robustly localize a camera system with respect to the environment (a minimal code sketch follows this list),
* compute a dense 3D reconstruction of the surrounding scene,
* segment the obtained 3D model by using both geometry and semantics,
* use deep learning to enable advanced scene understanding and improve SLAM performance.
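As a concrete taste of the first point, the following is a minimal sketch (not the course's reference implementation) of two-view camera pose estimation with OpenCV: ORB features are matched across two frames, an essential matrix is fitted with RANSAC, and the relative rotation and translation are recovered. The intrinsic matrix K and the frame filenames are placeholder assumptions; a real pipeline would obtain them from camera calibration.

```python
import numpy as np
import cv2

# Hypothetical intrinsics; in practice these come from camera calibration.
K = np.array([[700.0,   0.0, 320.0],
              [  0.0, 700.0, 240.0],
              [  0.0,   0.0,   1.0]])

def estimate_relative_pose(img1, img2, K):
    """Estimate the relative camera motion between two grayscale frames."""
    # 1. Detect and describe ORB keypoints in both frames.
    orb = cv2.ORB_create(nfeatures=2000)
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)

    # 2. Match descriptors with brute-force Hamming matching and cross-check.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

    # 3. Robustly fit the essential matrix with RANSAC, then recover R and t.
    E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC,
                                   prob=0.999, threshold=1.0)
    _, R, t, mask = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
    return R, t  # rotation and (up-to-scale) translation of frame 2 w.r.t. frame 1

# Example usage with two hypothetical frames on disk:
# img1 = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)
# img2 = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)
# R, t = estimate_relative_pose(img1, img2, K)
```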
We present emerging spatial AI techniques with many potential applications, including mixed reality, virtual
reality, and cognitive robotics. The course includes hands-on use of the presented techniques, based on dedicated
computer vision libraries such as OpenCV, applied to selected case studies.
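To give a flavour of such hands-on work, here is a minimal OpenCV sketch of dense 3D reconstruction from a rectified stereo pair using semi-global block matching. The image paths, the Q matrix file, and the matcher parameters are illustrative assumptions, not material from the course; in practice they come from stereo calibration and tuning.

```python
import numpy as np
import cv2

# Hypothetical rectified stereo pair and 4x4 disparity-to-depth matrix Q
# (Q is normally produced by cv2.stereoRectify during calibration).
left = cv2.imread("left_rect.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right_rect.png", cv2.IMREAD_GRAYSCALE)
Q = np.load("Q.npy")

# Semi-Global Block Matching: estimates a dense disparity map from the pair.
stereo = cv2.StereoSGBM_create(
    minDisparity=0,
    numDisparities=128,      # must be divisible by 16
    blockSize=5,
    P1=8 * 5 * 5,            # smoothness penalties (example values)
    P2=32 * 5 * 5,
    uniquenessRatio=10,
    speckleWindowSize=100,
    speckleRange=2,
)
# compute() returns fixed-point disparities scaled by 16; convert to pixels.
disparity = stereo.compute(left, right).astype(np.float32) / 16.0

# Back-project valid disparities into a 3D point cloud (one point per pixel).
points_3d = cv2.reprojectImageTo3D(disparity, Q)
valid = disparity > 0
cloud = points_3d[valid]
print(f"Reconstructed {cloud.shape[0]} 3D points")
```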