How to chat with your PDFs using local Large Language Models [Ollama RAG]

29,777 views

The How-To Guy

1 day ago

In this tutorial, we'll explore how to create a local RAG (Retrieval Augmented Generation) pipeline that processes and allows you to chat with your PDF file(s) using Ollama and LangChain!
✅ We'll start by loading a PDF file using the "UnstructuredPDFLoader"
✅ Then, we'll split the loaded PDF data into chunks using the "RecursiveCharacterTextSplitter"
✅ Create embeddings of the chunks using "OllamaEmbeddings"
✅ We'll then use the "from_documents" method of "Chroma" to create a new vector database, passing in the updated chunks and Ollama embeddings
✅ Finally, we'll answer questions based on the new PDF document using the "chain.invoke" method and provide a question as input
The model will retrieve relevant context from the updated vector database, generate an answer based on the context and question, and return the parsed output.
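For readers who want to see these steps end to end, here is a minimal sketch of the pipeline described above. It assumes Ollama is running locally with the nomic-embed-text and mistral models already pulled (swap in whichever chat model you prefer), and "my_document.pdf" is a placeholder path; the exact parameters used in the video may differ.

```python
from langchain_community.document_loaders import UnstructuredPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_community.chat_models import ChatOllama
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

# 1. Load the PDF
data = UnstructuredPDFLoader(file_path="my_document.pdf").load()

# 2. Split the loaded data into chunks
splitter = RecursiveCharacterTextSplitter(chunk_size=1500, chunk_overlap=200)
chunks = splitter.split_documents(data)

# 3-4. Embed the chunks with Ollama and store them in a Chroma vector database
vector_db = Chroma.from_documents(
    documents=chunks,
    embedding=OllamaEmbeddings(model="nomic-embed-text"),
    collection_name="local-rag",
)

# 5. Build a simple retrieval chain and ask a question about the PDF
llm = ChatOllama(model="mistral")  # any chat model you have pulled with Ollama
prompt = ChatPromptTemplate.from_template(
    "Answer the question based only on the following context:\n{context}\n\nQuestion: {question}"
)
chain = (
    {"context": vector_db.as_retriever(), "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
print(chain.invoke("What is this document about?"))
```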
TIMESTAMPS:
============
0:00 - Introduction
0:07 - Why you need to use local RAG
0:52 - Local PDF RAG pipeline flowchart
5:49 - Ingesting PDF file for RAG pipeline
8:46 - Creating vector embeddings from PDF and store in ChromaDB
14:07 - Chatting with PDF using Ollama RAG
20:03 - Summary of the RAG project
22:33 - Conclusion and outro
LINKS:
=====
🔗 GitHub repo: github.com/tonykipkemboi/olla...
Follow me on socials:
𝕏 → / tonykipkemboi
LinkedIn → / tonykipkemboi
#ollama #langchain #vectordatabase #pdf #nlp #machinelearning #ai #llm #RAG

COMMENTS: 179
@levi4328
@levi4328 21 день тому
I'm a medical researcher and, surprisingly, my life is all about PDFs I don't have any time to read, let alone learn the basics of code. And I think there are a lot of people in the same boat as me. Unfortunately, it's very hard to actually find an AI tool that's even barely reliable. Most of YouTube is swamped with sponsors for AI magnates trying to sell their rebranded, redundant, worthless AI thingy for a monthly subscription or an unjustifiably costly API that follows the same premise. The fact that you, the only one that came closer to what I actually need - and a very legitimate need - is a channel with
@tonykipkemboi
@tonykipkemboi 21 день тому
Thank you so much for sharing the pain points you're experiencing and the solution you're seeking. I'd like to be more helpful to you and the many others like you as well. I have an idea of creating a UI using Streamlit for the code in this tutorial, with a step-by-step explanation of how to get it running on your system. You would essentially clone the repository, install Ollama and pull any models you like, install the dependencies, then run Streamlit. You'd then be able to upload PDFs in the Streamlit app and chat with them in a chatbot-like interface. Let me know if this would be helpful. Thanks again for your feedback.
@ilyassemssaad9012
@ilyassemssaad9012 19 днів тому
Hey, hit me up and I'll give you my RAG that supports multiple PDFs and lets you choose whichever LLM you want to use.
@Aberger789
@Aberger789 14 днів тому
I'm in the space as well and am trying to find the best way to parse PDFs. I've set up GROBID on Docker and tried that out. My work laptop is a bit garbage, and being in the world's largest bureaucracy, procuring hardware is a pain in the ass. Anyways, great video.
@kumarmanchoju1129
@kumarmanchoju1129 13 днів тому
Use NVIDIA's Chat with RTX for PDF summarizing and querying. Purchase a cheap RTX card with a minimum of 8 GB of VRAM.
@InnocentiusLacrimosa
@InnocentiusLacrimosa 12 днів тому
@@tonykipkemboi I think most people's pain right now is exactly this part: "upload PDFs to service X". This is what they want/have to avoid. Anyhow, nice video you made here.
@gptOdyssey
@gptOdyssey 13 днів тому
Clear instruction, excellent tutorial. Thank you Tony!
@tonykipkemboi
@tonykipkemboi 13 днів тому
Thank you for the feedback and glad you liked it! 😊
@chrisogonas
@chrisogonas 16 днів тому
Simple and well illustrated, Arap Kemboi 👍🏾👍🏾👍🏾
@tonykipkemboi
@tonykipkemboi 16 днів тому
Asante sana bro! 🙏
@johnlunsford5868
@johnlunsford5868 12 днів тому
Top-tier information here. Thank you!
@tonykipkemboi
@tonykipkemboi 12 днів тому
🙏
@claussa
@claussa 23 дні тому
Welcome on my special list of channels I subscribe to. Looking forward to you making me smarter😊
@tonykipkemboi
@tonykipkemboi 23 дні тому
Thank you for that honor! I'm glad to be on your list and will do my best to deliver more awesome content! 🙏
@deldridg
@deldridg 2 дні тому
Thank you for this excellent intro. You are a natural teacher of complex knowledge and this has certainly fast-tracked my understanding. I'm sure you will go far and now you have a new subscriber in Australia. Cheers and thank you - David
@tonykipkemboi
@tonykipkemboi 2 дні тому
Glad to hear you found the content useful and thank you 🙏 😊
@n0madc0re
@n0madc0re 13 днів тому
this was super clear, extremely informative, and was spot on with the exact answers I was looking for. Thank you so much.
@tonykipkemboi
@tonykipkemboi 13 днів тому
Glad you found it useful and thank you for the feedback!
@ISK_VAGR
@ISK_VAGR 20 днів тому
Congrats man. Really useful content. Well explained and effective.
@tonykipkemboi
@tonykipkemboi 20 днів тому
Thank you, @ISK_VAGR! 🙌
@aloveofsurf
@aloveofsurf 8 днів тому
This is a fun and potent project. This provides access to a powerful space. Peace be on you.
@tonykipkemboi
@tonykipkemboi 8 днів тому
Thank you and glad you like it!
@HR31.1.1
@HR31.1.1 15 днів тому
Dope video man! Keep them coming
@tonykipkemboi
@tonykipkemboi 15 днів тому
Appreciate it!!
@Reddington27
@Reddington27 22 дні тому
Thats a pretty clean explanation. looking for more videos.
@tonykipkemboi
@tonykipkemboi 22 дні тому
Thank you! Glad you like the delivery. I got some more cooking 🧑‍🍳
@DaveJ6515
@DaveJ6515 8 днів тому
Very good! Easy to understand, easy to try, expandable ....
@tonykipkemboi
@tonykipkemboi 8 днів тому
Awesome! Great to hear.
@DaveJ6515
@DaveJ6515 8 днів тому
@@tonykipkemboi you deserve it. Too many LLM YouTubers are more concerned with showing a lot of things than with making them easy to understand and reproduce. Keep up the great work!
@Mind6
@Mind6 9 днів тому
Very helpful! Great video! 👍
@tonykipkemboi
@tonykipkemboi 9 днів тому
🙏❤️
@konscious_kenyan
@konscious_kenyan 21 день тому
Good to see fellow Kenyans on AI. Perhaps the Ollama WebUI approach would be easier for beginners as one can attach a document, even several documents to the prompt and chat.
@tonykipkemboi
@tonykipkemboi 21 день тому
🙏 Yes, actually working on a Streamlit UI for this
@notoriousmoy
@notoriousmoy 23 дні тому
Great job
@tonykipkemboi
@tonykipkemboi 23 дні тому
Thank you! 🙏
@grizzle2015
@grizzle2015 9 днів тому
thanks man this is extremely helpful!
@tonykipkemboi
@tonykipkemboi 9 днів тому
🙏🫡
@franciscoj.moyaortiz7025
@franciscoj.moyaortiz7025 День тому
awesome content! new sub
@tonykipkemboi
@tonykipkemboi День тому
Thank you! 🙏
@ninadbaruah1304
@ninadbaruah1304 11 днів тому
Good video 👍👍👍
@tonykipkemboi
@tonykipkemboi 11 днів тому
@Marduk477
@Marduk477 17 днів тому
Really useful content and well explained. It would be interesting to see a video with different types of files, not only PDFs - for example Markdown, PDF, and CSV all at once.
@tonykipkemboi
@tonykipkemboi 14 днів тому
Thank you! I have this in my content pipeline.
@iceiceisaac
@iceiceisaac 8 днів тому
so cool!
@tonykipkemboi
@tonykipkemboi 8 днів тому
Thank you 🙏
@Joy_jester
@Joy_jester 18 днів тому
Can you make one video of RAG using Agents? Great video btw. Thanks
@tonykipkemboi
@tonykipkemboi 18 днів тому
Sure thing. I actually have this in my list of upcoming videos. Agentic RAG is pretty cool right now and will play with it and share a video tutorial. Thanks again for your feedback.
@metaphyzxx
@metaphyzxx 17 днів тому
I was planning on doing this as a project. If you beat me to it, I can compare notes
@rmperine
@rmperine День тому
Great delivery of material. How about fine-tuning for llama3 using your own curated dataset as a video? There are some out there, but your teaching style is very good.
@tonykipkemboi
@tonykipkemboi День тому
Thank you and that's a great suggestion! I'll add that to my list.
@georgerobbins5560
@georgerobbins5560 16 днів тому
Nice
@tonykipkemboi
@tonykipkemboi 16 днів тому
Thank you!
@rockefeller7853
@rockefeller7853 25 днів тому
Thanks for the share. Quite enlightening. I will definitely build upon that. Here is the problem I have: let's say I have two documents and I want to chat with both at the same time (for instance, to extract conflicting points between the two). What would you advise here?
@tonykipkemboi
@tonykipkemboi 24 дні тому
Thank you! That's an interesting use case for sure. My instinct, before looking up some solutions, is to maybe create 2 separate collections, one for each of the files, then retrieve from them separately and chat with them for comparison. My suggestion above might not be efficient at all. I will do some digging and share any info I find.
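A rough sketch of that two-collection idea (my own assumption of how it could look, not code from the video): embed each PDF into its own Chroma collection, run the same question against both, and hand both sets of context to the LLM to contrast. The file names here are placeholders.

```python
from langchain_community.document_loaders import UnstructuredPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma

splitter = RecursiveCharacterTextSplitter(chunk_size=1500, chunk_overlap=200)
embeddings = OllamaEmbeddings(model="nomic-embed-text")

# One collection per document ("paper_a.pdf" / "paper_b.pdf" are placeholders)
db_a = Chroma.from_documents(
    splitter.split_documents(UnstructuredPDFLoader("paper_a.pdf").load()),
    embeddings, collection_name="doc_a")
db_b = Chroma.from_documents(
    splitter.split_documents(UnstructuredPDFLoader("paper_b.pdf").load()),
    embeddings, collection_name="doc_b")

question = "What is the main claim?"
context_a = db_a.similarity_search(question, k=3)
context_b = db_b.similarity_search(question, k=3)
# Put both contexts into one prompt and ask the model to list the points
# where the two documents conflict.
```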
@carolinefrasca
@carolinefrasca 10 днів тому
🤩🤩
@unflexian
@unflexian 6 днів тому
Oh I thought you were saying you've embedded an LLM into a PDF document, like those draggable 3d diagrams.
@stanTrX
@stanTrX 19 днів тому
Thanks. Can you please explain it one step at a time and slowly, especially the RAG part?
@tonykipkemboi
@tonykipkemboi 18 днів тому
Thanks for asking. Which part of the RAG pipeline?
@thealwayssmileguy9060
@thealwayssmileguy9060 17 днів тому
Would love it if you can make the Streamlit app! I am still struggling to make a Streamlit app based on open-source LLMs.
@tonykipkemboi
@tonykipkemboi 14 днів тому
Thank you! Yes, I'm working on a Streamlit RAG app. I have released a video on Ollama + Streamlit UI that you can start with in the meantime.
@thealwayssmileguy9060
@thealwayssmileguy9060 14 днів тому
@@tonykipkemboi thanks bro! I will defo watch👌
@guanjwcn
@guanjwcn 23 дні тому
Thanks. Btw, how did you make your YouTube profile photo? It looks very nice.
@tonykipkemboi
@tonykipkemboi 23 дні тому
Thank you! 😊 I used some AI avatar generator website that I forgot but I will find it and let you know.
@guanjwcn
@guanjwcn 22 дні тому
Thank you
@garthcase1829
@garthcase1829 21 день тому
Great job. Does the file you chat with have to be a PDF or can it be a CSV or other structured file type?
@tonykipkemboi
@tonykipkemboi 21 день тому
🙏 thank you. I'm actually working on a video for RAG over CSV. The demo in this tutorial will not work for CSV or structured data; we need a better loader for structured data.
@enochfoss8993
@enochfoss8993 День тому
Great video! Thanks for sharing. I ran into an issue with a Chroma dependency on SQLite3 (i.e. RuntimeError: Your system has an unsupported version of sqlite3. Chroma requires sqlite3 >= 3.35.0). The suggested solutions are not working. Is it possible to use another DB in place of Chroma?
@tonykipkemboi
@tonykipkemboi День тому
Thank you! Yes, you can swap it with any other open-source vector database. You might also try using a more recent version of Python, which should come with a newer version of SQLite. Do you know what version you are using now? You can also try installing the binary version in the notebook like so: `!pip install pysqlite3-binary`
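For anyone hitting the same SQLite error, a commonly used workaround (my addition here, not something shown in the video) is to install pysqlite3-binary and alias it to the stdlib sqlite3 module name before Chroma is imported:

```python
# Run `!pip install pysqlite3-binary` in the notebook first.
# This swaps the bundled, newer SQLite in before chromadb checks the version.
__import__("pysqlite3")
import sys
sys.modules["sqlite3"] = sys.modules.pop("pysqlite3")

import chromadb  # should now import without the sqlite3 >= 3.35.0 error
```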
@deldridg
@deldridg 2 дні тому
Is it possible to upload multiple PDF documents using the langchain doc loaders and then converse across them? Excellent tut and thanks - David
@tonykipkemboi
@tonykipkemboi 2 дні тому
That's definitely possible. Are you thinking of two PDFs that each carry different content?
@deldridg
@deldridg 2 дні тому
@@tonykipkemboi Thank you for taking the time to reply - much appreciated. I was just wondering whether this approach allows the ingestion of multiple documents which could be contrasted or used in conjunction with each other. Cheers mate - David
@farexBaby-ur8ns
@farexBaby-ur8ns 14 днів тому
Good one. OK, you touched on security - you have something here that doesn't let data flow out to the internet. I saw a bunch of videos about tapping data from databases using SQL agents, but none said anything specific about security. So the question: does using SQL agents violate data security?
@tonykipkemboi
@tonykipkemboi 13 днів тому
You bring up a critical point and question. Yes, I believe most agentic workflows currently, especially tutorials, lack proper security and access moderation. This is a growing and evolving portion of agentic frameworks + observability, IMO. I like to think of it as people needing special access to databases at work and someone managing roles and the scope of access. So agents will need some form of that management as well.
@DataScienceandAI-doanngoccuong
@DataScienceandAI-doanngoccuong 24 дні тому
Can this model query tabular data or image data, or can't it?
@tonykipkemboi
@tonykipkemboi 24 дні тому
I assume you're talking about Llama2? Or are you referring to the Nomic text embedding model? If it's Llama2, it's possible to use it to interact with tabular data by passing the data to it (RAG or just pasting data to the prompt) but cannot vouch for its accuracy though. Most LLMs are not great at advanced math but they're getting better for sure.
@ariouathanane
@ariouathanane 4 дні тому
Thank you very much for your videos. Please, what if we have several PDFs?
@tonykipkemboi
@tonykipkemboi 2 дні тому
Yes, so you can iteratively load the pdfs, chunk them by page or something else, then index them in a vector database. You would then ask your query like always and it would find the context throughout all the documents to give you an answer.
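A minimal sketch of that iterative approach (assuming the same loader, splitter, and embedding model as in the tutorial; the file names are placeholders):

```python
from langchain_community.document_loaders import UnstructuredPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma

pdf_paths = ["report_a.pdf", "report_b.pdf", "report_c.pdf"]  # placeholders

splitter = RecursiveCharacterTextSplitter(chunk_size=1500, chunk_overlap=200)
all_chunks = []
for path in pdf_paths:
    docs = UnstructuredPDFLoader(file_path=path).load()
    chunks = splitter.split_documents(docs)
    for chunk in chunks:
        chunk.metadata["source"] = path  # remember which PDF each chunk came from
    all_chunks.extend(chunks)

# Index everything in one collection; retrieval then spans all the PDFs
vector_db = Chroma.from_documents(
    documents=all_chunks,
    embedding=OllamaEmbeddings(model="nomic-embed-text"),
    collection_name="multi-pdf-rag",
)
```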
@xrlearn
@xrlearn 24 дні тому
Thanks for sharing this. Very helpful. Also, what are you using for screen recording and editing this video ? I see that it records the section where your mouse cursor is ! Nice video work as well. Only suggestion is to increase gain in your audio
@tonykipkemboi
@tonykipkemboi 24 дні тому
I'm glad you find it very helpful. I'm using Screen Studio (screen.studio) for recording; it's awesome! Thank you so much for the feedback as well. I actually reduced it during editing thinking it was too loud haha. I will make sure to readjust next time.
@xrlearn
@xrlearn 24 дні тому
@@tonykipkemboi Btw, can you see those 5 questions that it generated before summarizing the document?
@tonykipkemboi
@tonykipkemboi 24 дні тому
@@xrlearn, I'm sure I can. I will try printing them out and share them here with you tomorrow.
@tonykipkemboi
@tonykipkemboi 22 дні тому
Hi @xrlearn - Found a way to print the 5 questions using `logging`. Here's the code you can use to print out the 5 questions:

```
import logging

logging.basicConfig()
logging.getLogger("langchain.retrievers.multi_query").setLevel(logging.INFO)

unique_docs = retriever.get_relevant_documents(query=question)
len(unique_docs)
```

Here are more detailed docs from LangChain that will help: python.langchain.com/docs/modules/data_connection/retrievers/MultiQueryRetriever/
@yashvarshney8651
@yashvarshney8651 13 днів тому
could you drop a tutorial on building rag chatbots with ollama and langchain with custom data and guard-railing?
@tonykipkemboi
@tonykipkemboi 12 днів тому
That sounds interesting and something I'm looking into as well. For guard-railing, what are your thoughts on the frameworks for this portion? Have you tried any?
@yashvarshney8651
@yashvarshney8651 12 днів тому
@@tonykipkemboi realpython.com/build-llm-rag-chatbot-with-langchain/ I've read this article, and the only guard-railing mechanism they seem to apply is an additional prompt with every inference.
@scrollsofvipin
@scrollsofvipin 16 днів тому
What GPU do you use? I have Ollama running on an Intel i5 with integrated graphics, so I'm unable to use any of the 3B+ models. TinyLlama and TinyDolphin work, but the accuracy is way off.
@tonykipkemboi
@tonykipkemboi 16 днів тому
I have an Apple M2 with 16GB of memory. I noticed that larger models slow down my system and sometimes force a shutdown of everything. One way around it is deleting other models you're not using.
@spotnuru83
@spotnuru83 11 днів тому
Firstly, thank you for sharing this entire tutorial, really great. I tried to implement it and got all the issues resolved, but it looks like I am not getting any output after I ask a question. I see OllamaEmbeddings at 100% five times and then nothing happens after that; the program just quits without giving any answer. Will you be able to help me figure out how to get it working?
@tonykipkemboi
@tonykipkemboi 11 днів тому
Thank you for your question. Did you use the same models as in the tutorial, or did you use another one? Are you able to share your code?
@spotnuru83
@spotnuru83 10 днів тому
@@tonykipkemboi I copied your code exactly.
@spotnuru83
@spotnuru83 10 днів тому
The reason was that I did not use a Jupyter notebook; I was running it in VS Code, and I had to save the value returned by the chain's invoke method and print it. Then it started working. This is amazing, thank you so much. Really appreciate it.
@user-tl1ms1bq6n
@user-tl1ms1bq6n 14 днів тому
Pls provide notebook if possible. great video.
@tonykipkemboi
@tonykipkemboi 14 днів тому
Thank you! Checkout the repo link in the description for all the code. Here's the link github.com/tonykipkemboi/olla...
@hectorelmagotv8427
@hectorelmagotv8427 9 днів тому
@@tonykipkemboi hey, the link is not working, can provide it again pls?
@hectorelmagotv8427
@hectorelmagotv8427 9 днів тому
no problem, didnt see the description, thanks!
@tonykipkemboi
@tonykipkemboi 9 днів тому
@@hectorelmagotv8427 , thanks. Just to confirm, did it work?
@user-tl4de6pz7u
@user-tl4de6pz7u 10 днів тому
I encountered several errors when trying to execute the following line in the code: data = loader.load() Despite installing multiple modules, such as pdfminer, I'm unable to resolve an error stating "No module named 'unstructured_inference'." Has anyone else experienced similar issues with this code? Any assistance would be greatly appreciated. Thank you!
@tonykipkemboi
@tonykipkemboi 10 днів тому
Interesting that it's asking for that, since it's used for layout parsing and we didn't use it. Try installing it like so: `!pip install unstructured-inference`
@ayushmishra5861
@ayushmishra5861 23 дні тому
I've been given a story, the Trojan War, which is a 6-page PDF (or I can even use the story as text). Also, 5 pre-decided questions are given to ask based on the story. I want to evaluate different models' answers, but I am failing to evaluate even one. Kindly help, please guide thoroughly.
@ayushmishra5861
@ayushmishra5861 23 дні тому
Can you please reply, would really appreciate that.
@tonykipkemboi
@tonykipkemboi 23 дні тому
This sounds interesting! I believe if you're doing this locally, you can follow the tutorial to create embeddings of the PDF and store them in a vector DB, then use the 5 questions to generate output from the models. You can switch the model type in between each response and probably have to save each response separately so you can compare them afterwards.
@ayushmishra5861
@ayushmishra5861 23 дні тому
@@tonykipkemboi How much storage will the model take? I don't have the greatest hardware.
@tonykipkemboi
@tonykipkemboi 23 дні тому
Yes, there are smaller quantized models on Ollama you can use, but most of them require a sizeable amount of RAM. Check out these instructions from Ollama on the size you need for each model. You can also do one at a time, then delete the model after use to create space for the next one you pull. I hope that helps. github.com/ollama/ollama?tab=readme-ov-file#model-library
@pw4645
@pw4645 4 дні тому
Hi, and if there were 6 or 10 PDFs, how would you load them into the RAG? Thanks
@tonykipkemboi
@tonykipkemboi 4 дні тому
Good question! I would iterate through them while loading and also index the metadata so it's easy to reference which PDF provided the context for the answer. There are actually several ways of doing this, but that would be my simple first try.
@erickcedeno7823
@erickcedeno7823 9 днів тому
Nice video. When I try to execute the following commands: `!ollama pull nomic-embed-text` and `!ollama list`, I receive the following error: /bin/bash: line 1: ollama: command not found
@tonykipkemboi
@tonykipkemboi 9 днів тому
This error means that Ollama is not installed on your system or not found in your system's PATH. Do you have Ollama installed already?
@erickcedeno7823
@erickcedeno7823 5 днів тому
@@tonykipkemboi Hello, I've installed Ollama on my local system, but I don't know why I'm getting an error in Google Colab.
@ayushmishra5861
@ayushmishra5861 3 дні тому
Retrieving answers from the vector database takes a good minute on my MacBook Air. How do I scale this? Can you add a Pinecone layer to it?
@tonykipkemboi
@tonykipkemboi 3 дні тому
So this was a demonstration of running with everything local and nothing online other than when downloading the packages. You can hook up any vector store you like for example Pinecone as you've mentioned. Just beware that since the local models will still be in use, it will still be slow if your system is slow already. Might consider using paid services if you're looking for a lower latency solution.
@ayushmishra5861
@ayushmishra5861 3 дні тому
@@tonykipkemboi So Tony, what I am trying to build is something like a website where people come and drop their PDFs and can do Q&A. In my learning and implementation I found that generating embeddings for my 10-page PDF does not take a lot of time; it used to before I switched to the embedding model you used, so the embedding part is sorted. I tried implementing the code with Chroma and FAISS; the results are almost equal. Even for a small PDF, it takes a minute to answer. I understand it takes computational resources from my local machine, which happens to be a MacBook Air M1. Do you have a machine with a better GPU, and let's assume yours produces the retrieved results in under 10 seconds? Nobody would like to wait a minute or more on a website for an answer. I'm also scared that if there are hundreds of thousands of users, I would need to purchase a GPU farm for this to work, lol. Note - I have never made a scalable project before. Please guide. Also, share how much time it takes on your PC/laptop for the answer to come back from the vector DB, so I can understand whether it's my system that is weak or whether libraries like Chroma and FAISS are not meant for scalability.
@ayushmishra5861
@ayushmishra5861 3 дні тому
@@tonykipkemboi .
@ayushmishra5861
@ayushmishra5861 2 дні тому
can anyone answer this please?
@tonykipkemboi
@tonykipkemboi 2 дні тому
@@ayushmishra5861 so my system is just like yours with 16GB RAM. It takes about a minute or less to get an answer back for a few pdf pages embedded. For longer ones, it even takes longer. One portion that slows the process is the "multiqueryretriever" which I added and talked about in the video. It generates 5 more questions and those have to get the context from the vector db as well which slows down the time to output significantly. Try without the multiqueryretriever and see if that speeds up your process.
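To check whether the MultiQueryRetriever is the bottleneck, one quick experiment (assuming the `vector_db` and `llm` objects from the tutorial) is to compare it against the plain vector-store retriever, which issues a single query per question:

```python
from langchain.retrievers.multi_query import MultiQueryRetriever

# Generates several reformulated queries per question, so it hits the local
# LLM and the vector store multiple times (slower, often better recall)
multi_query_retriever = MultiQueryRetriever.from_llm(
    retriever=vector_db.as_retriever(), llm=llm
)

# Single similarity search per question (faster)
plain_retriever = vector_db.as_retriever(search_kwargs={"k": 4})

docs = plain_retriever.get_relevant_documents("What is this document about?")
```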
@ruidinis75
@ruidinis75 8 днів тому
We do not need an API key for this?
@tonykipkemboi
@tonykipkemboi 8 днів тому
Nope, don't need one.
@suryapraveenadivi851
@suryapraveenadivi851 14 днів тому
PDFInfoNotInstalledError: Unable to get page count. Is poppler installed and in PATH? please help with this........
@tonykipkemboi
@tonykipkemboi 14 днів тому
Are you doing a different modification of the code in the tutorial or using OCR? I would check out the install steps on their repo here (github.com/Belval/pdf2image) and probably use ChatGPT for debugging as well.
@levsigal6151
@levsigal6151 5 днів тому
@@tonykipkemboi I've got the same error and I am using PDF file. Please advise.
@brianclark4639
@brianclark4639 7 днів тому
I tried the first command, %pip install -q unstructured langchain, and it's taking a super long time to install. Is this normal?
@tonykipkemboi
@tonykipkemboi 6 днів тому
It shouldn't take more than a couple of seconds but also depending on your system and package manager, it might take a while. Did it resolve?
@Hoxle-87
@Hoxle-87 6 днів тому
Where is the RAG part?
@tonykipkemboi
@tonykipkemboi 6 днів тому
The whole thing actually is RAG.
@Hoxle-87
@Hoxle-87 6 днів тому
@@tonykipkemboi thanks. So adding the pdfs augments the LLM, got it.
@sebinanto3733
@sebinanto3733 12 днів тому
Hey, if we are using Google Colab instead of Jupyter, how will we be able to incorporate Ollama with Google Colab?
@tonykipkemboi
@tonykipkemboi 12 днів тому
I haven't tried this myself, but here are some resources that might be helpful:
1. medium.com/@neohob/run-ollama-locally-using-google-colabs-free-gpu-49543e0def31
2. stackoverflow.com/questions/77697302/how-to-run-ollama-in-google-colab
@suryapraveenadivi851
@suryapraveenadivi851 14 днів тому
ERROR:unstructured:Following dependencies are missing: pikepdf, pypdf. Please install them using `pip install pikepdf pypdf`. WARNING:unstructured:PDF text extraction failed, skip text extraction... please help
@tonykipkemboi
@tonykipkemboi 14 днів тому
Have you tried installing what it's asking for `pip install pikepdf pypdf`?
@suryapraveenadivi851
@suryapraveenadivi851 13 днів тому
@@tonykipkemboi Thank you so much!! for your reply this got resolved..
@tonykipkemboi
@tonykipkemboi 13 днів тому
@@suryapraveenadivi851 glad it worked! Happy coding.
@ayushmishra5861
@ayushmishra5861 23 дні тому
can I do this on google colab?
@tonykipkemboi
@tonykipkemboi 23 дні тому
This is local using Ollama so not possible following this specific tutorial. you can however use other public models that have API endpoints that you can call from Colab. I also want to mention that I have not explored trying to access the local models through Ollama using Colab.
@N0rt
@N0rt 18 днів тому
Whenever I try `pip install -q unstructured["all-docs"]` (using Win11) I keep getting a subprocess error: Getting requirements to build wheel did not run successfully. exit code: 1. Any help appreciated!!
@N0rt
@N0rt 18 днів тому
AssertionError: Could not find cmake executable!
@tonykipkemboi
@tonykipkemboi 18 днів тому
Which Python version are you using? It seems there's an issue with the latest Python versions, and it seems to work best with Python
@N0rt
@N0rt 16 днів тому
@@tonykipkemboi I'm on Python 3.12
@tonykipkemboi
@tonykipkemboi 16 днів тому
@@N0rt, I see. The unstructured package has had a lot of errors recently. You could try installing CMake on your Windows machine using the instructions below. Let me know if it works.
1. Go to cmake.org/download
2. Select Windows (Win32 Installer)
3. Run the installer
4. When prompted, select Add CMake to the system PATH for all users
5. Run the software installation
6. Double-click the downloaded executable file
7. Follow the instructions on the screen
8. Reboot your machine
@N0rt
@N0rt 14 днів тому
@@tonykipkemboi thank you so much ill give it a try!
@malleswararaomaguluri6344
@malleswararaomaguluri6344 14 днів тому
I want the page source after the answer. How do I add that in the code?
@tonykipkemboi
@tonykipkemboi 14 днів тому
You would need to save the documents from the vector store that match the query, then add them as part of the response. There's probably a more efficient way to do it with the metadata, but I haven't played around with it much. Will make sure to add that portion in my next RAG video. Thank you.
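One possible way to do that today (my own sketch, not the video's code, and assuming the `vector_db` and `chain` objects built in the tutorial) is to fetch the matching chunks yourself and print their metadata alongside the answer:

```python
question = "What are the key findings?"

# Retrieve the chunks the answer is likely based on
matching_docs = vector_db.as_retriever(search_kwargs={"k": 4}).get_relevant_documents(question)

print(chain.invoke(question))
print("Sources:")
for doc in matching_docs:
    # UnstructuredPDFLoader records at least the source file path; page numbers
    # only appear in some loader modes, so treat this field as illustrative.
    print("-", doc.metadata.get("source"), "| page:", doc.metadata.get("page_number"))
```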
@malleswararaomaguluri6344
@malleswararaomaguluri6344 14 днів тому
@@tonykipkemboi thank you, because it will play a major role if we are handling a huge document or multiple documents.
@tonykipkemboi
@tonykipkemboi 14 днів тому
@@malleswararaomaguluri6344 absolutely 💯. Yes, being able to cite at page level and even maybe showing a snippet of the sources would be great.
@muhammedyaseenkm9292
@muhammedyaseenkm9292 7 днів тому
How can I install unstructured[all-docs]? I cannot install it.
@tonykipkemboi
@tonykipkemboi 6 днів тому
Did you install it like this: `!pip install -q "unstructured[all-docs]"`?
@fahaamshawl9335
@fahaamshawl9335 День тому
Is this scalable?
@tonykipkemboi
@tonykipkemboi День тому
To some extent but also your system setup and configuration is a major limiting factor
@okjustdoit
@okjustdoit 3 дні тому
Are you Kenyan?
@tonykipkemboi
@tonykipkemboi 3 дні тому
Yes, by birth. I'm located in the U.S. atm.
@maxmuster7003
@maxmuster7003 15 днів тому
The pdf refuses to answer.
@tonykipkemboi
@tonykipkemboi 15 днів тому
Can you explain more about your process?
@maxmuster7003
@maxmuster7003 15 днів тому
@@tonykipkemboi I tried to use chatgpt online, but it gave me wrong answers.
@tonykipkemboi
@tonykipkemboi 14 днів тому
Have you tried the way I did it in this tutorial?
@maxmuster7003
@maxmuster7003 14 днів тому
@@tonykipkemboi My PC is broken, i am on an Android tablet.
@tonykipkemboi
@tonykipkemboi 14 днів тому
@@maxmuster7003 ah I see. Yeah, you have to be on a computer due to memory.
@aashishmittal206
@aashishmittal206 12 днів тому
I have a 10 year old PC which has 8GB RAM. Can I use local AI on this machine or do I have to buy a new machine to get it working?
@InnocentiusLacrimosa
@InnocentiusLacrimosa 12 днів тому
That sounds really bad. Why do that to yourself? Anyhow, at that point I would just use Google Colab or something similar: you do not have cuda capable GPU, your cpu is bad and your RAM is barely enough to run the operating system.
@aashishmittal206
@aashishmittal206 12 днів тому
@@InnocentiusLacrimosa Is Google colab free?
@tonykipkemboi
@tonykipkemboi 12 днів тому
Yes, Colab is free and can get you going but you would still need to download Ollama and am not sure you have enough RAM to get any models going.
@aashishmittal206
@aashishmittal206 12 днів тому
@@tonykipkemboi If I were to buy a new PC, what should be the specs which is future proof for the next 5 years if all I do is code and do Adobe applications as well? And by coding, I meant using LLMs locally among other things.
@InnocentiusLacrimosa
@InnocentiusLacrimosa 11 днів тому
@@aashishmittal206 You are talking about a desktop, not a laptop? If I were seriously into LLMs, I would try to get a used 3090, as those sell for 600-700 USD/EUR and are the cheapest entry to 24GB VRAM. If that is not an option (too expensive), then the next tier down in cost is 12GB cards, and the RTX 3060 is the cheapest of those. That is enough to run perhaps the 7-billion-parameter models (?). At least 32GB RAM and a modern CPU (13600K or R5 7600) are the practical entry point. Without a budget/needs it is really hard to give more advice. You talk about future-proofing and doing coding and LLMs, but your current PC kind of suggests that you are not making money from that work, so it is either study or a hobby. You could be from anywhere and have access to different hardware sets, etc. If budget is not an issue, one could easily just get a 4x3090 or 4x4090 + 14900K/7950X + 128GB RAM, or a top-of-the-line MacBook with M3 Max and 128GB unified RAM. To invest in those, one needs to know that it is an investment that pays for itself (rather fast). But then again, if you are in that space, you would probably be building solutions based on cloud computing again :-D
@DC-xt1ry
@DC-xt1ry 7 днів тому
Dealing with PDFs in RAG is the worst thing you can consider! Unreliable.
@tuanthanh-fg4xr
@tuanthanh-fg4xr 9 днів тому
Why can't I use `!ollama pull nomic-embed-text`? Can someone help me?
@tonykipkemboi
@tonykipkemboi 9 днів тому
What error are you getting when you run the cell?
@tuanthanh-fg4xr
@tuanthanh-fg4xr 8 днів тому
@@tonykipkemboi I've done it; I had to run this line in the terminal, not a Jupyter cell.
@suryapraveenadivi851
@suryapraveenadivi851 14 днів тому
`!ollama run nomic-embed-text` gives: /bin/bash: line 1: ollama: command not found. Please help with this too.
@tonykipkemboi
@tonykipkemboi 14 днів тому
Should be this instead: `ollama pull nomic-embed-text`
@suryapraveenadivi851
@suryapraveenadivi851 13 днів тому
@@tonykipkemboi Tried this got File "", line 1ollama pull nomic-embed-text SyntaxError: invalid syntax
@tonykipkemboi
@tonykipkemboi 12 днів тому
I think you used "1" instead of "!" in `!ollama pull nomic-embed-text`