Datagen Technologies, the maker of a synthetic data platform that generates visual data for the training of computer vision systems, today said it closed a $50 million Series B financing round, bringing its total funding to $70 million.
The company is creating a computer vision stack that can be used to simulate real-world images for use in training machine learning models. Its self-service offering allows engineers to adopt synthetic data by creating faces and full-body simulated data for an hourly charge.
The Tel Aviv-based company said it has seen eightfold revenue growth over the past year, helped by the adoption of its platform by three of the top five technology companies.
Lack of training data is a significant impediment to the development of machine learning models in general and computer vision in particular.
“If you want to place a dashboard camera in your car to detect driver drowsiness you need to train an AI model that represents hundreds of millions of images of people of different ethnicities doing different things in different weather conditions and car types,” said co-founder and Chief Executive Ofir Chakon. “You need to spend months or years capturing images and then annotate each one according to where the face is, where the eyes are and this is a completely manual process.”
Small sample, big dataset
Datagen uses computers to represent real-world scenarios. That requires a small amount of training data captured by scans, which is then used to create thousands or millions of variations. “We can take a few hundred scans of humans and turn them into a wide variety of types,” Chakon said. “When we build the simulation we know exactly where they’re located, the reflections of the light, where the eyes are looking and what people are holding. All this metadata can be output without the need for labeling.”
Customers can generate millions of variants in a few hours with the cloud service. “If you want to create millions of tables, you don’t need to model a million, you basically need 10 of different types and shapes,” Chakon said.
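The arithmetic behind that claim can be illustrated with a minimal Python sketch. It is not Datagen's code: the asset names, variation axes and value ranges below are all hypothetical, chosen only to show how a handful of base models multiplies into many variants, each carrying its labels from the moment it is generated.

```python
from itertools import product

# Hypothetical base assets and variation axes; none of these names or
# values come from Datagen. They only illustrate the combinatorics.
BASE_TABLES = [f"table_{i}" for i in range(10)]   # 10 hand-modeled tables
MATERIALS = ["oak", "glass", "steel", "plastic"]
SCALES = [0.8, 1.0, 1.2]
LIGHT_ANGLES_DEG = range(0, 360, 30)              # 12 lighting directions
CAMERA_YAWS_DEG = range(0, 360, 45)               # 8 camera positions


def generate_variants():
    """Yield one labeled scene description per parameter combination."""
    for base, material, scale, light, yaw in product(
        BASE_TABLES, MATERIALS, SCALES, LIGHT_ANGLES_DEG, CAMERA_YAWS_DEG
    ):
        # Every value was chosen by the generator, so the annotation is
        # known up front and needs no manual labeling step.
        yield {
            "base_model": base,
            "material": material,
            "scale": scale,
            "light_angle_deg": light,
            "camera_yaw_deg": yaw,
        }


def count_variants():
    """Number of variants is the product of the sizes of all axes."""
    return (
        len(BASE_TABLES)
        * len(MATERIALS)
        * len(SCALES)
        * len(LIGHT_ANGLES_DEG)
        * len(CAMERA_YAWS_DEG)
    )
```

With these made-up axes, 10 base tables already produce 10 × 4 × 3 × 12 × 8 = 11,520 labeled variants. Adding a few more axes, or finer steps on the existing ones, pushes the total into the millions without modeling any additional tables.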
Gartner Inc. last year estimated that, by 2024, “60% of the data used for the development of AI and analytics projects will be synthetically generated.”
Datagen has been selling primarily to tech companies and development teams in the automotive, conferencing, augmented and virtual reality, and home security markets, but can adapt its technology to other domains. “You can ask for specific objects, environments and interactions and they will be available in a matter of days,” Chakon said.
The funding will be used primarily for product development, sales and marketing. “We will add more domains to the data generation layer and also build more solutions on top of the layer to go beyond the training phase to include testing and enrichment of existing data,” Chakon said.
Funding was led by new investor Scale Venture Partners, with participation from existing investors TLV Partners LLC, Viola Group Inc. and Spider Capital Partners Management LLC.