Inside .lumen: How they scaled their annotation process

July 5th, 2024 - 5 min read

.lumen’s mission is deeply personal for Cornel Amariei. As the only non-disabled member of his family, Cornel witnessed the challenges faced by his relatives, inspiring him to develop technology that improves their quality of life.

.lumen’s core product, the .lumen glasses, aims to address the limitations of current mobility solutions for the visually impaired with pedestrian autonomous driving AI. While guide dogs have traditionally been a key aid for the visually impaired, their availability is limited due to the high cost and extensive training required. Only around 28,000 guide dogs are available for over 300 million visually impaired individuals worldwide. This disparity shows the need for a scalable solution.

Custom hardware for real-time performance

A man working intently on electronics at a desk, surrounded by wires and equipment, in a dimly lit room with blue lighting.

The .lumen glasses are equipped with an array of sensors, primarily cameras, which are crucial for mapping and navigating the environment. In addition to six cameras, the setup includes an RGB depth camera and infrared modalities to extract additional information.

The need for high performance and precise feedback drove the decision to use custom hardware. Standard smartphones, while powerful, cannot sustain the real-time processing required for .lumen’s application. Additionally, the interface for providing feedback in various situations needed to be seamless and reliable, necessitating a custom hardware solution.

Overcoming challenges with real-time data

Deploying real-time models on wearable devices is challenging. Much current research is moving towards LLMs and foundation models that do not run in real time. But to actually help people, .lumen needs everything to happen on the device in a split second. That’s why .lumen focuses on creating highly specific models tailored to their needs and utilizing the hardware to the maximum.

One hurdle in real-time navigation is integrating real-world data from the cameras with information from online sources such as maps and GPS. This integration is what enables planning and routing.

Optimizing the internal data annotation flow

A person walking on a forest trail, with an overlay of colored point cloud data visualizing sensor data in front of them.

Accurate and extensive annotated data is essential for training the models that power the .lumen glasses. The team trains about ten models weekly.

After evaluating multiple platforms, .lumen chose its annotation platform for its stability and ease of use. The fit with the data management flow, including how easy it is to upload and retrieve data, also influenced the decision.

“There were platforms with, at first sight, a bit more functionality. However, when running an internal competition to test the efficiency and reliability of the different platforms, we found multiple of those features were unnecessary for what actually had to be accomplished. Even more so, some optional functionalities got in the way.”

Scaling and optimizing an internal annotation team

.lumen’s labeling team, comprising students and young professionals, has proven highly effective. This team, which started with the “brother of someone,” has grown into a well-organized unit that consistently delivers high-quality annotated data. The team expanded by establishing a solid structure, processes, and guidelines, allowing for easy scaling.

The platform allows the team to quickly onboard new annotators and produce high-quality labels rapidly. The platform’s dashboard numbers help to keep track of performance.

“We learned a lot by building the data annotation team internally. We recently passed 300,000 semantic annotated images. Originally, we were scared we had to replace our team every three months. However, due to a strong company culture and values and the possibility of growing into other roles, many of the team members stayed on. People often join as a vacation job but end up staying. The number one reason people leave the data labeling team is because they’re promoted to other teams in the company.”

Collage of cityscapes with colored semantic segmentation overlays, showing labeled cities like Munich, Prague, Vienna, Budapest, Sofia, and Bucharest.

At first, the data the team had to annotate was not the most interesting. But now, the team annotates first-person footage covering hundreds of kilometers in places around the world they have never traveled to.

The performance-based remuneration model has been particularly successful, allowing .lumen to reduce the team size while increasing productivity. This was an unexpected lesson, as the company didn’t start this way. With performance-based remuneration, annotators who want to put in the work can earn an above-country-average salary. .lumen cut the team size by a factor of two and still doubled performance.

Of course, there are still challenges. Everyone on the labeling team is used to semantic annotation, but annotators occasionally need a different kind of task to break the monotony. Paying for those other tasks by performance is harder than for semantic annotation, where the remuneration figures are clear.

Defining the right classes

One difficulty was determining which classes to label. Ambiguities often arose, causing issues for the navigation teams. Labeling a concept without a concrete real-life representation could lead to inconsistencies and compromise model accuracy.

A specific example: labeling data in historical urban centers, a common challenge in Europe. These areas often have streets that are difficult to categorize as either pedestrian or automotive. To address this, .lumen created a new classification—shared streets—and annotated 10,000 images accordingly. However, in the end, this approach failed for several reasons.

Ensuring quality in data annotations

Street scene with colored semantic segmentation overlays, showing a person, a car, buildings, and trees.

Initially, the engineering team reviewed data to establish quality standards. Over time, experienced annotators took over, maintaining standards through peer reviews. This peer review system keeps quality consistent, draws on the expertise of seasoned annotators, and fosters a collaborative environment where knowledge and best practices are shared.

.lumen employs random sampling of labeled data for review by the machine learning team. This adds an extra layer of quality control. Responsibilities have gradually shifted to the data labeling team, creating a balanced workflow.
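A minimal sketch of what such random sampling could look like. The function name, the 5% review fraction, and the image ID format are all assumptions made for illustration; the article does not describe .lumen's actual tooling or sampling rate.

```python
import random


def sample_for_review(image_ids, fraction=0.05, seed=None):
    """Return a random subset of labeled image IDs for a second review.

    `fraction` is a hypothetical review rate, not a figure from .lumen.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    image_ids = list(image_ids)
    if not image_ids:
        return []
    rng = random.Random(seed)
    # Review at least one image whenever the batch is non-empty.
    k = max(1, round(len(image_ids) * fraction))
    return rng.sample(image_ids, k)


# Hypothetical batch of 200 labeled images.
batch = [f"img_{i:05d}" for i in range(200)]
review_set = sample_for_review(batch, fraction=0.05, seed=42)
print(len(review_set), "images selected for review")
```

Passing a fixed seed makes the selection reproducible, which helps when a review round has to be audited later.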

.lumen created a comprehensive labeling guide that is regularly updated to reflect new insights. A dedicated channel allows annotators to post questions and get prompt answers, ensuring clarity and accuracy in the labeling process.

Located just a few floors apart, the data labeling and machine learning teams can easily collaborate to resolve any issues. This proximity ensures that problems are addressed promptly and that the data labeling process runs smoothly.

For instance, a former data labeler who demonstrated exceptional skills was moved to the machine learning team. This individual now serves as a liaison, facilitating better understanding and cooperation between the two groups.

.lumen found that annotating sequential data by different people sometimes led to inconsistencies. To address this, they implemented a system where multiple people label a sequence, but a single person reviews the entire sequence. This approach ensures consistency across consecutive frames, as the same person maintains a uniform standard throughout the sequence.
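The assignment scheme described above can be sketched roughly as follows. The chunk size, the round-robin distribution, and the annotator names are hypothetical; the article only states that several people label a sequence and one person reviews all of it.

```python
from itertools import cycle


def assign_sequence(frame_ids, labelers, reviewer, chunk_size=50):
    """Split one sequence into chunks for several labelers.

    A single reviewer is attached to the whole sequence so that one
    person applies the same standard across consecutive frames.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    if not labelers:
        raise ValueError("at least one labeler is required")
    labeler_cycle = cycle(labelers)
    chunks = []
    for start in range(0, len(frame_ids), chunk_size):
        chunks.append({
            "labeler": next(labeler_cycle),
            "frames": frame_ids[start:start + chunk_size],
        })
    return {"reviewer": reviewer, "chunks": chunks}


# Hypothetical sequence of 120 frames, two labelers, one reviewer.
plan = assign_sequence(list(range(120)), ["ana", "bogdan"],
                       reviewer="carmen", chunk_size=50)
for chunk in plan["chunks"]:
    print(chunk["labeler"], len(chunk["frames"]))
print("reviewer:", plan["reviewer"])
```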

In addition to human reviews, .lumen uses machine learning models to identify and flag potential annotation errors. These models analyze the labeled data and detect anomalies or inconsistencies, which are then reviewed by the team. This AI-driven quality check acts as an additional safeguard.
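One simple way such a check could work is to compare each human-labeled mask with a model's predicted mask and flag images where the two disagree too much. This is an illustrative sketch only: the article does not say which models or metrics .lumen uses, and the pixel-agreement metric, the 0.9 threshold, and the tiny 2x2 masks are assumptions.

```python
def pixel_agreement(label_mask, predicted_mask):
    """Fraction of pixels where the human label matches the model prediction."""
    if len(label_mask) != len(predicted_mask):
        raise ValueError("masks must have the same number of rows")
    total = 0
    matches = 0
    for label_row, pred_row in zip(label_mask, predicted_mask):
        if len(label_row) != len(pred_row):
            raise ValueError("mask rows must have the same length")
        for label, pred in zip(label_row, pred_row):
            total += 1
            matches += label == pred
    if total == 0:
        raise ValueError("masks must not be empty")
    return matches / total


def flag_for_review(samples, threshold=0.9):
    """Return IDs of samples whose agreement falls below the threshold."""
    return [
        sample_id
        for sample_id, label_mask, predicted_mask in samples
        if pixel_agreement(label_mask, predicted_mask) < threshold
    ]


# Hypothetical samples: (id, human label mask, model prediction mask).
samples = [
    ("frame_a", [[0, 0], [1, 1]], [[0, 0], [1, 1]]),
    ("frame_b", [[0, 0], [1, 1]], [[0, 1], [0, 1]]),
]
flagged = flag_for_review(samples)
print(flagged)
```

A flagged image is not necessarily mislabeled, since the model itself may be wrong. That is why flagged items still go to a human for review.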

A visually impaired person using a DotLumen device crosses a street at a marked crosswalk.

Post-commercial release data annotation

As .lumen’s product reaches the market, data annotation will need to adapt. With devices deployed globally, smart data collection will become crucial. The process will involve optimizing how data is recorded, stored, and uploaded. Despite these changes, the fundamental annotation task remains the same: identifying and focusing on edge cases where the model performs sub-optimally. This means narrowing and filtering data to target specific edge cases more effectively.
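As a rough illustration of narrowing and filtering, the sketch below keeps only frames where a model reported low confidence and ranks the least confident first. Using model confidence as the edge-case signal is an assumption, as are the field names and the 0.6 threshold; the article does not say how .lumen selects edge cases.

```python
def select_edge_cases(frames, confidence_threshold=0.6, max_frames=None):
    """Keep frames with low model confidence, least confident first.

    `frames` is a list of dicts with hypothetical keys "id" and "confidence".
    """
    candidates = [f for f in frames if f["confidence"] < confidence_threshold]
    candidates.sort(key=lambda f: f["confidence"])
    if max_frames is not None:
        candidates = candidates[:max_frames]
    return candidates


# Hypothetical recorded frames with per-frame model confidence.
frames = [
    {"id": "f1", "confidence": 0.95},
    {"id": "f2", "confidence": 0.40},
    {"id": "f3", "confidence": 0.58},
    {"id": "f4", "confidence": 0.99},
]
to_annotate = select_edge_cases(frames)
print([f["id"] for f in to_annotate])
```

Capping the result with `max_frames` would let a device upload only its hardest few frames, in line with the point about optimizing what is stored and uploaded.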

A former Formula One track in Japan with a photorealistically painted road designed for pedestrians, resembling an actual road from certain angles.

While the system works correctly in over 99% of scenarios, the remaining 1% presents unique challenges. These are situations where even humans struggle to interpret the data correctly. One notable example comes from a former Formula One track in Japan. A path to the grandstand, intended for pedestrians, was painted photorealistically to look like a road and could easily be mistaken for an actual road from certain angles. Such cases show how hard it can be to distinguish between pedestrian and automotive paths, especially when marketing designs or unusual constructions are involved.

Integrating AI with classical computer vision and robotics is key to .lumen’s approach. This combination ensures high reliability and supports collaborative efforts among teams to enhance overall performance. Even if the navigation model only performs well in 80% of cases, other methods and technologies help cover the remaining challenges. This holistic approach underscores .lumen’s commitment to creating a robust product, leveraging AI as part of a broader, integrated system.