Training real AI with fake data

AI systems have an endless appetite for data. For an autonomous car’s camera to identify pedestrians every time — not just nearly every time — its software needs to have studied countless examples of people standing, walking and running near roads.

Yes, but: Gathering and labeling those images is expensive and time-consuming, and in some cases impossible. (Imagine staging a huge car crash.) So companies are teaching AI systems with fake photos and videos, sometimes also generated by AI, that stand in for the real thing.

The big picture: A few weeks ago, I wrote about the synthetic realities that surround us. Here, the machines that we now rely on — or may soon — are also learning inside their own simulated worlds.

How it works: Software that has been fed tons of human-labeled photos and videos can deduce the shapes, colors and movements that correspond, say, to a pedestrian.

But there’s an ever-present danger that the car will come across a person in a setting unlike any it’s seen before and, disastrously, fail to recognize them.
That’s where synthetic data can fill the gap. Computers can generate millions of scenes that an actual car might not experience, even after a million driving hours.

What’s happening: Startups like Landing.ai, AI.Reverie, CVEDIA and ANYVERSE can create super-realistic scenes and objects for AI systems to learn from.

Nvidia and others make synthetic worlds for digital versions of robots to play in, where they can test changes or learn new tricks to help them navigate the real world.
And autonomous vehicle makers like Waymo build their own simulations to train or test their driving software.

Synthetic data is useful for any AI system that interacts with the world — not just cars.

In health care, made-up data can substitute for sensitive information about patients, mirroring characteristics of the population without revealing private details.
In manufacturing, “if you’re doing visual inspection on smartphones, you don’t have a million pictures of scratched smartphones,” says Andrew Ng, founder of Landing.ai, who co-founded Google Brain and was chief scientist at Baidu. “If you can get something to work with just 100 or 10 images, it breaks open a lot of new applications.”
In robotics, it’s helpful to imitate hard-to-find conditions. “It’s very expensive to go out and vary the lighting in the real world, and you can’t vary the lighting in an outdoor scene,” says Mike Skolones, director of simulation technology at Nvidia. But you can in a simulator.
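
To make the lighting point concrete, here is a minimal Python sketch. It is not how Nvidia's simulators or any company named here actually work: it swaps in a far cruder technique, simple brightness scaling of an existing image, to show the basic idea of turning one labeled example into several under different lighting. The function names, the brightness range and the tiny stand-in image are all assumptions made for illustration.

```python
import random


def vary_lighting(image, factor):
    """Scale each pixel of a grayscale image (rows of 0-255 ints) by
    `factor`, clamping results to the valid 0-255 range."""
    return [[max(0, min(255, round(p * factor))) for p in row] for row in image]


def make_variants(image, label, count, seed=0):
    """Return `count` (image, label) pairs, each with a random brightness.

    The 0.3-1.7 range is an arbitrary illustrative choice, not a
    recommended setting.
    """
    rng = random.Random(seed)  # seeded so runs are repeatable
    variants = []
    for _ in range(count):
        factor = rng.uniform(0.3, 1.7)
        variants.append((vary_lighting(image, factor), label))
    return variants


# A tiny 2x3 stand-in for a labeled camera frame (hypothetical data).
frame = [[10, 120, 250], [0, 60, 200]]
variants = make_variants(frame, "pedestrian", count=5)
```

Each variant keeps the original label, so one hand-labeled frame yields five training examples. A real simulator goes much further, re-rendering the whole scene with new light sources, shadows and reflections rather than rescaling pixels.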

“We’re still in the early days,” says Evan Nisselson of LDV Capital, a venture firm that invests in visual technology.

But, he says, synthetic data keeps getting closer to reality.
Generative adversarial networks — the same AI technology that drives most deepfakes — have helped vault synthetic data to new heights of realism.

News Team

Leave a Reply

Sharing

Leave a Reply