How To Build Generative AI Models Like OpenAI's Sora

73,197 views

Y Combinator

1 day ago

If you read articles about companies like OpenAI and Anthropic training foundation models, it would be natural to assume that if you don’t have a billion dollars or the resources of a large company, you can’t train your own foundation models. But the opposite is true.
In this episode of the Lightcone Podcast, we discuss strategies for building a foundation model from scratch in less than 3 months, with examples of YC companies doing just that. We also get an exclusive look at OpenAI's Sora!
Read more about the YC AI companies from this episode on our blog: www.ycombinator.com/blog/buil...
Chapters (Powered by bit.ly/chapterme-yc) -
00:00 - Coming Up
01:13 - Sora Videos
05:05 - How Sora works under the hood?
08:19 - How expensive is it to generate videos vs. texts?
10:01 - Infinity AI
11:23 - Sync Labs
13:41 - Sonauto
15:44 - Metalware
17:40 - Guide Labs
19:29 - Phind
24:21 - Diffuse Bio
25:36 - Piramidal
27:15 - K-Scale Labs
28:58 - DraftAid
30:38 - Playground
33:20 - Outro

COMMENTS: 85
@chapterme 29 days ago
Chapters (Powered by ChapterMe) -
00:00 - Coming Up
00:49 - Intro: Generative AI for Video
01:13 - Sora Videos
05:05 - How Sora works under the hood?
08:19 - How expensive is it to generate videos vs. texts?
08:55 - How do YC companies build foundation models with just $500K?
10:01 - Demos: Infinity AI
11:23 - Sync Labs' hack to train a Lip Sync Model with a single A100 GPU
12:45 - YC deal with Azure
13:41 - How Sonauto Built a Text-to-Song Model
15:44 - Metalware: Hardware Co-Pilot
17:40 - Guide Labs: Explainable Foundation Model
18:20 - Building your own models vs. Using existing open source models
19:29 - Phind's Clever Hack: Synthetic Data
22:03 - Simulating real-world physics: Atmo (Foundational model for weather prediction)
24:21 - AI in Biology: Diffuse Bio
25:36 - Piramidal: Foundational model for the human brain
27:15 - AI in Robotics: K-Scale Labs
28:58 - DraftAid: AI Models for CAD Design
30:38 - Playground going against giants and Suhail Doshi Background
31:42 - Companies pivoting into AI
32:44 - Takeaway Message
33:20 - Outro
@avi2125 29 days ago
The text/prompt for the video was quite detailed and informative. Even as a bad programmer I was able to mentally construct an algorithm for a video on the fly...maybe I have to watch this podcast more than the first 5 mins to understand why Sora etc is a big deal...
@BrianMPrime 1 month ago
The lipsynching on Tim Ferriss looked way off. There was a bit of an uncanny valley with the deepfake switchover as well.
@danielmarco7863 1 month ago
This is definitely a launched product that the founders are embarrassed by. In the sense that they understand this is not representative of the final product, which many will suggest is indicative of the proper time to launch. Definitely applying the "law of papers" to my understanding of the state of the art video generation.
@jks234 1 month ago
Interestingly... the podcast's lipsyncing is also a bit off already. So perhaps it's just an audio sync issue.
@joythought 1 month ago
​@@jks234 yes, YT is terrible for lipsync at times so probably best to download the episode and then watch as a local copy to have some hope of seeing it the way they saw the demo.
@BrianMPrime 29 days ago
@@danielmarco7863 I appreciate that attitude towards building, kudos to the team for launching early!
@jks234 1 month ago
20:15 I personally find the concept of synthetic data to be a fascinating spur for more neuroscientific research. People dream about what they study and are constantly reviewing problems they are working on in their head. In other words, I feel that humans use simulations in their own mind to build out the models they use to understand their world. We might be able to think of this as "generating 1000x more data" than can be directly extracted from the real world. Another example of this that was done to awesome effect is AlphaGo's self-play training.
@andybrice2711 27 days ago
I would maybe argue synthetic data isn't inherently circular, it's just inverted. Whenever you've got a transformation which is easy in one direction but difficult in the other, synthetic data is a sensible approach. For example, it's easy to rasterize vector graphics, but it's more difficult to vectorize raster graphics.
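The "inverted" idea in this comment can be sketched in a few lines. This is a toy illustration only, not how Phind or any company in the episode builds its data: the 8x8 grid, the axis-aligned segments, and the names `rasterize`, `random_segment`, and `make_synthetic_pairs` are all assumptions made up for the example. It runs the easy direction (vector to raster) to produce labelled pairs that a model for the hard direction (raster to vector) could be trained on.

```python
import random

def rasterize(segment, size=8):
    """Easy direction: draw an axis-aligned segment onto a size x size grid."""
    (x0, y0), (x1, y1) = segment
    grid = [[0] * size for _ in range(size)]
    for x in range(min(x0, x1), max(x0, x1) + 1):
        for y in range(min(y0, y1), max(y0, y1) + 1):
            grid[y][x] = 1
    return grid

def random_segment(rng, size=8):
    """Sample a hypothetical horizontal or vertical segment inside the grid."""
    a, b = sorted(rng.sample(range(size), 2))  # two distinct coordinates, a < b
    c = rng.randrange(size)                    # the fixed coordinate
    if rng.random() < 0.5:
        return ((a, c), (b, c))  # horizontal
    return ((c, a), (c, b))      # vertical

def make_synthetic_pairs(n, seed=0, size=8):
    """Build (raster, vector) training pairs by running only the easy direction."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        label = random_segment(rng, size)               # ground-truth vector
        pairs.append((rasterize(label, size), label))   # input for a vectorizer
    return pairs

if __name__ == "__main__":
    raster, label = make_synthetic_pairs(1)[0]
    print(label)
    for row in raster:
        print("".join("#" if cell else "." for cell in row))
```

Every label here is correct by construction, since the raster was generated from it, which is why the approach is not circular in the way the comment describes.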
@juanortega7509 1 month ago
I've been waiting for a new episode for weeks!! Thanks for the content guys!
@alejandroVigano 1 month ago
Thanks for sharing these talks!
@alicapwn 21 days ago
They didn’t source robotics papers for Sora’s architecture. They combined Diffusion Transformers (developed by Peebles) with the video diffusion methods released by Stability/Google/Meta/Nvidia.
@samshoman 1 month ago
Wow, the song startup is better than anything I have seen so far.
@atchutram9894 18 days ago
11:40 Hindi demo is perfect. My first language is not Hindi but I can definitely tell it is a great translation.
@DevilerServinal 1 month ago
Thank you so much!!!!!!!!!!!!
@DiasporaPay 18 days ago
This is awesome thanks!
@theni3762 28 days ago
All you're really saying here is that people can build any foundational models as long as openai doesn't also do it. That's not very reassuring to hear. We started with words, now pictures and videos, why would anyone not expect music, robotics, hardware etc down the line?
@bahlechonco211 1 month ago
Great insight
@fil4dworldcomo623 1 month ago
I think Sora is better positioned on imagining a new world and totally a different world than to simulate our perception of what the world is and what the world was.
@awesomeo4510 1 month ago
Yes but how do you find the datasets to train for new foundational models? Like their EEG example - how did she acquire this data to train the models?
@LuisPerez-uh9ik 1 month ago
Just take it!
@joythought 1 month ago
Isn't she an expert in the field with papers published in Nature? If so, she has the data. If you want similar data you need to partner with researchers.
@minc33 1 month ago
Where there’s a will, there’s a way!
@sergismael 1 month ago
best episode so far.
@pandainvestingco 1 month ago
I love this series
@sgdfly8715 29 days ago
An idea that anyone can take (though it might already exist): Use AI to help recreate crime scenes and make recommendations on what data might help better understand and solve cases. The ideal solution would be able to use data from other cases in order to improve recommendations.
@jess-e 29 days ago
Who can share the papers which are necessary to get to a level of understanding that is actionable? As explained in the video :)
@AdityaVG10 27 days ago
I have been looking for those papers! Tell me if you get some.
@AfeezAbdulAziz 25 days ago
@@AdityaVG10 me too! I’m still finding out about this
@gibsonhu6502 29 days ago
Are there links to the sora videos they are showing?
@FunwithBlender 1 month ago
Alibaba is also doing some interesting things with AI video, we (open source community) have almost destructured the process.
@kog0824 29 days ago
17:20 This seems an interesting approach… but sorry, I am new to this AI space: what does it mean to build its own foundation model but with gpt2.5? Does it mean it fine-tunes gpt2.5 with its own data?
@fortunefubara1244 25 days ago
Yes.
@Alice8000 28 days ago
NICE VIDEO MY FRIENDS
@vikalpjain1098 16 days ago
At 4:17 to 4:20, in one of the columns, one ladder joint got added.
@xilluminati 1 month ago
̶f̶ i̶r̶s̶t̶…. no… early adopter
@pandainvestingco 1 month ago
😂
@raymond_luxury_yacht 1 month ago
interesting that raytracing in games might be done and games will be diffused not rendered
@FunwithBlender 1 month ago
the lipsync has some better open source free solutions but still cool
@FunwithBlender 1 month ago
Respectfully, Stable Diffusion is way better than anything else. To act like Midjourney or Playground is better is to not understand the flexibility and creativity you have with Stable Diffusion. Stable Diffusion can combine with ControlNet, there is a massive community on Civitai with LoRA and textual inversion etc., and there are a thousand things you can do, from Deforum to you name it. Stable Diffusion is the only model that can give you precision when needed if you know how to use it. Yes, it's more complex, but it is the best model.
@JohnSmith-he5xg 27 days ago
12:40 Really burying the lede here to the question "How are YC companies able to create these models with only $500k?" We arranged for free compute with MSFT (she didn't say how much, but said hundreds of times more than they'd get otherwise)
@adiveena 28 days ago
How to work this type startup
@rcstann 1 month ago
¹1¹! It's "Sam" day in the Bay area.
@AM-kx4ue 13 days ago
Hi everyone, I'm exploring how startups are balancing AI model training with customer data privacy, especially in competitive industries where data can make a difference against competitors. If you have insights or experiences to share on anonymization techniques, federated learning, differential privacy, or service models with privacy tiers, I'd love to hear from you. Let's discuss this further and exchange strategies for responsible AI development.
@rodi4850 1 month ago
4:47 there's tons of videos of the golden gate in 360 - gaussian splatting can do it much better 😁
@jks234 1 month ago
15:04 memeworthy clip
@reza2kn 1 month ago
I appreciate the show and encouraging people to go for it, and I get hyping up the early YC-backed products, but the first couple weren't even super impressive by March 2024 standards, let alone being "the best thing" on the market. I'm not bashing any of the products and I hope they do awesome, I'm just saying these are not at all good examples of "the best we have right now", and is discouraging to hear from you guys. @ 11:42 The lip sync is completely off. This while perfecting lip sync motion was already accomplished last year. @15:40: Check out Suno AI v3. That's like GPT-4 compared to GPT-2 (what you showed here)
@LuisPerez-uh9ik 1 month ago
They also are young founders. Looks to me like they are pushing this to encourage ai in yc
@fanaccount6600 1 month ago
why is that cup on the ground instead of being on the table?!
@vslaykovsky 1 month ago
this is an AI-generated video, that's why
@swaggitypigfig8413 1 month ago
So they can grab it with their toes and fling it towards each other as a conflict resolution technique.
@shallindurani 1 month ago
I wonder what the dog thinks about him lol
@harshitgauravtiwari 1 month ago
What if this video also is ai generated
@harshitgauravtiwari 1 month ago
Omg, I am the first to comment. I have a startup in semiconductors. Hope someday I will meet with Y Combinator 😊
@john-kv7kl 1 month ago
bruh it is ai generated. 10:33
@joythought 1 month ago
This comment is AI generated.
@FunwithBlender 1 month ago
Okay, I am sold on YC lol, will submit my application. Access to GPUs for fine-tuning is valuable.
@shrawanthakur4168 28 days ago
It’s just the start of the AI and a lot of Sci-Fi things becoming real.
@pauldannelachica2388 1 month ago
❤❤❤❤❤❤
@GigaFro 1 month ago
Seeing one example of the generated spelling being correct or even a few does not mean there was any advancement in this area...
@perrssssjjwjwkriri883 29 days ago
No way u dont kno who that is 11:53
@crowsnest6753 25 days ago
the use case is clearly VR gaming. Next stop - VR movies
@nischalnayak391 1 month ago
Great! I watched this video to realise I need millions of free credit to build a foundational model for free
@0x0michael 29 days ago
What sora imagined was a single-laned residential street, lots more space for trees, gardening, walking and for neighborhood activities. Cars move one-way in from one direction and out in the opposite.
@saravanashanmukham6108 23 days ago
Inspiring to know AI barrier can be overcome without a PhD in ML/AI. Thanks guys!
@vincentwady 1 month ago
Let’s push 100% AI to the market. There should not be single human needed for a corporation after that.
@FunwithBlender 1 month ago
I hope playground wins though the more competition the better
@Cygx 1 month ago
Feels like I’m sitting in listening to the four smartest kids in my class XD
@jeffsteyn7174 28 days ago
Looking down on synthetic data makes no sense. Models like Orca were built on synthetic data and outperform models 10x their size.
@gunaysoni6792 15 days ago
The models you showcased today aren't really "foundational models" (at least in the way the term is currently used), and a lot of what you show isn't super new. Saying that you don't need a lot of GPUs to compete is very misleading.
@kamal_pratap 1 month ago
the hell?
@Authormatthewtaylordotcom 18 days ago
Thanks for sharing! Love the content. Any great repositories for the latest academic papers/journals to read up on as mentioned near the end?
@Alice8000 28 days ago
I hope you guys are very successful so you can buy some furniture! lol jk bro. just a prank bro.
@Mooohbroadcast 29 days ago
Thanks for sharing one more useless hype. You jumped from blockchain to crypto, NFT, and finally to AI. You should change your brand name in Y Hype 💩
@rodi4850 1 month ago
A guy not speaking Hindi gives his opinion on a lip sync model speaking Hindi 😂
@alexanikiev 27 days ago
This comment alone is a “great” example of stereotypical thinking. The problem is that we are already living in the 21st century and people speaking 3-4 languages on a daily basis is pretty expected 8)
@tf_9047 1 month ago
AI, even at current levels of capability, is far too dangerous to our society to be released to startups or governments or businesses or the public. We need startups to tackle the safety of these models at a more aggressive rate than capabilities advances.
@ashleigh3021 1 month ago
People limiting AI are extremely dangerous. We need rule of law to tackle Luddism in the public and protect technology from ignorance.
@joythought 1 month ago
Seriously, how would a start up solve the alignment problem when that is out of their hands. Better for them to do new things building new models. The great thing about human agency is it's almost unstoppable. The great thing about AI agency is it can be switched off. Anyone fearing the rise of the machines has no idea how much power that is going to draw. Simple enough to switch off at the mains.
@djpete2009 29 days ago
@@ashleigh3021 Can you imagine? Crazy people!!
@GatherVerse 26 days ago
If you really want to add value to this podcast why not add a black person to the conversation? We recommend Christopher Lafayette. He's in the Valley and can contribute well to this conversation and draws an audience. Else, find someone else, but consider the upside to this. Thanks.