Smart speakers should be able to adapt to their environment to optimize sound reproduction [1]. To achieve this, it is necessary to estimate the geometry of the room in which they are placed. There are two main approaches. The first uses acoustic methods: the smart speaker emits sound waves and measures the time they take to return after reflecting off the room's surfaces. By analyzing these delays, the geometry of the room can be derived. Alternatively, geometry information can be obtained from images taken from different perspectives. These methods rely on the fundamental principles of structure from motion, which allow a 3D model to be derived.
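To make the acoustic idea concrete, here is a minimal sketch of how a measured echo delay maps to a wall distance. It assumes a single first-order reflection, a co-located loudspeaker and microphone, and a speed of sound of 343 m/s (air at roughly 20 degrees Celsius). The function names `wall_distance_m` and `room_extent_m` are hypothetical and chosen for illustration only; a real system would have to extract the delays from a measured room impulse response first.

```python
# Assumed speed of sound in air at roughly 20 degrees Celsius (illustrative value).
SPEED_OF_SOUND_M_S = 343.0


def wall_distance_m(round_trip_s, speed_m_s=SPEED_OF_SOUND_M_S):
    """Distance to a reflecting surface, given the round-trip echo delay.

    The sound travels to the wall and back, so the one-way distance
    is half of the total path length.
    """
    if round_trip_s < 0:
        raise ValueError("round-trip delay must be non-negative")
    return speed_m_s * round_trip_s / 2.0


def room_extent_m(delay_a_s, delay_b_s, speed_m_s=SPEED_OF_SOUND_M_S):
    """Room extent along one axis from the echo delays of two opposite walls.

    Assumes the speaker sits on the line between both walls.
    """
    return wall_distance_m(delay_a_s, speed_m_s) + wall_distance_m(delay_b_s, speed_m_s)


if __name__ == "__main__":
    # Hypothetical delays of 10 ms and 20 ms to two opposite walls.
    print(wall_distance_m(0.020))
    print(room_extent_m(0.010, 0.020))
```

The image-based approach has no comparably short closed form: it requires feature matching and camera pose estimation across views before any 3D points can be triangulated.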
To contribute to the improvement of future smart speaker technology, your task is to compare these two modalities and identify their respective strengths and weaknesses. This analysis will serve as a foundation for combining them to achieve higher-quality reconstruction results.
Are you interested in improving the technology of future smart speakers?
Then have a look at our offer!