Dataleap Profile
Dataleap

@dataleapHQ

Followers
541
Following
29
Media
1
Statuses
85

Empowering business teams to build agents without asking engineering for help

San Francisco
Joined September 2022
@dataleapHQ
Dataleap
3 years
Finding the right information from your long-term memory vector store is key for managing the context of an agent well. Trying different algorithms here can lead to significantly better results! Try it!
@karpathy
Andrej Karpathy
3 years
Random note on k-Nearest Neighbor lookups on embeddings: in my experience much better results can be obtained by training SVMs instead. Not too widely known. Short example: https://t.co/RXO9xiOmAB Works because SVM ranking considers the unique aspects of your query w.r.t. data.
0
0
5
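A minimal sketch of the idea in the quoted tweet, assuming toy 2-D vectors in place of real embeddings. It is not the code behind the linked example: `knn_rank` and `svm_rank` are hypothetical helper names, and the SVM here is a small linear one trained by plain subgradient descent, with the query as the only positive example and every stored vector as a negative. Items are then ranked by the learned decision score instead of by raw cosine similarity.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def knn_rank(query, data):
    """Rank data indices by cosine similarity to the query (plain k-NN)."""
    q = normalize(query)
    scores = [dot(q, normalize(v)) for v in data]
    return sorted(range(len(data)), key=lambda i: scores[i], reverse=True)


def svm_rank(query, data, c=0.1, lr=0.01, epochs=500):
    """Rank data indices by the decision score of a linear SVM.

    The query is the single positive example and every stored vector is a
    negative. Hyperparameter values are illustrative assumptions.
    """
    xs = [normalize(query)] + [normalize(v) for v in data]
    ys = [1.0] + [-1.0] * len(data)
    n = len(xs)
    # Balanced class weights so the lone positive is not drowned out.
    class_weight = [n / 2.0] + [n / (2.0 * len(data))] * len(data)
    w = [0.0] * len(query)
    b = 0.0
    for _ in range(epochs):
        grad_w = list(w)  # gradient of the 0.5 * ||w||^2 regularizer
        grad_b = 0.0
        for x, y, cw in zip(xs, ys, class_weight):
            if y * (dot(w, x) + b) < 1.0:  # hinge loss is active
                for j, xj in enumerate(x):
                    grad_w[j] -= c * cw * y * xj
                grad_b -= c * cw * y
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    scores = [dot(w, x) + b for x in xs[1:]]
    return sorted(range(len(data)), key=lambda i: scores[i], reverse=True)


# Toy "embeddings" (assumed values, for illustration only).
data = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
query = [0.9, 0.1]

knn_order = knn_rank(query, data)
svm_order = svm_rank(query, data)
print("k-NN order:", knn_order)
print("SVM order: ", svm_order)
```

On data this small the two rankings agree. Any difference would only show up on real embeddings, where the SVM weights are shaped by how the query differs from the rest of the stored vectors.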
@dataleapHQ
Dataleap
3 years
🧵 1/ First things first: What's vector search? It's a technique to find similar data points within a data set. You might be familiar with LIKE queries in SQL - this is a similar concept, but with more complexity and possibilities. 🕵️‍♂️
1
1
10
@dataleapHQ
Dataleap
3 years
@pinecone @weaviate_io 8/ Querying an index is easy too. You can perform unary queries or query with a list of vectors. Just provide the vectors you want to find similarities for and the number of results you want returned. Voilà! 🎉
0
0
1
@dataleapHQ
Dataleap
3 years
@pinecone @weaviate_io 7/ Creating a vector index is simple with the right tools. Just import the necessary libraries, initialize your API key, and create an index. You can then insert your data as tuples containing the ID and vector representation of each object. 📚
1
0
1
@dataleapHQ
Dataleap
3 years
6/ Once you have a vector embedding, you need a way to run queries against it. That's where managed vector search like @pinecone or @weaviate_io comes in. You can store vector embeddings with IDs, tying data back to the objects they represent. 🔍
1
0
0
@dataleapHQ
Dataleap
3 years
5/ Obtaining a vector embedding requires ML techniques and a good understanding of the problem space. Image pixels are already numeric and can be flattened into vectors, but other kinds of data may need more work. 🧪
1
0
0
@dataleapHQ
Dataleap
3 years
4/ Take the phrase "one in a million." Is it more similar to "once in a lifetime" or "a million to one"? By creating vector embeddings, machine learning goes beyond human intuition to quantify similarity. 🧠
1
0
0
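To make the phrase comparison in 4/ concrete, here is a minimal sketch that scores the three phrases with cosine similarity. It assumes a crude bag-of-words count vector as a stand-in for a learned embedding, so the numbers reflect word overlap only. `bag_of_words` and `cosine_similarity` are illustrative helper names.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Toy 'embedding': word counts. A real model would learn its vectors."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


query = bag_of_words("one in a million")
lifetime = bag_of_words("once in a lifetime")
to_one = bag_of_words("a million to one")

sim_lifetime = cosine_similarity(query, lifetime)
sim_to_one = cosine_similarity(query, to_one)
print(f"once in a lifetime: {sim_lifetime:.2f}")
print(f"a million to one:   {sim_to_one:.2f}")
```

Word overlap alone favors "a million to one" (0.75 against 0.50), because it shares three of its four words with the query. A learned embedding may rank the phrases differently, since it is trained to capture meaning and not only shared words.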
@dataleapHQ
Dataleap
3 years
3/ Vector embeddings are numerical representations of complex data. They make it easier to run generic ML algorithms on sets of data. The idea is to convert real-world objects into numerical representations to measure similarity. 📊
1
0
0
@dataleapHQ
Dataleap
3 years
2/ To perform vector search, we use "vector embeddings" which convert complex data into simpler representations. These embeddings help quantify similarity between data points while maintaining their deeper meaning.
1
0
0
@dataleapHQ
Dataleap
3 years
1/ 🚨 Let's talk about AI Chatbot Security Risks: As companies race to deploy chatbots powered by large language models (LLMs), new security risks emerge that we need to address. Let's dive into these risks & discuss how we can approach them. Thread 👇
1
1
0
@dataleapHQ
Dataleap
3 years
🧵 1/ Excited to share some tips on how to work with long documents leveraging @LangChainAI chains! Let's dive into the four common methods and their pros and cons. Remember, there's no one-size-fits-all solution – context is key! 🚀
1
2
0
@dataleapHQ
Dataleap
3 years
6/ ⚠️ The pressure to launch products without due diligence may lead to more AI chatbot misuse than necessary. However, prioritizing security & safe deployment over the competition is pretty challenging in this environment.
0
0
0
@dataleapHQ
Dataleap
3 years
5/ 🔒 It may take years before we establish best security practices for AI chatbots, and in the meantime, expect a surge of chatbot exploits & companies scrambling to fix them.
1
0
0
@dataleapHQ
Dataleap
3 years
4/ 🦠 Data Poisoning: Introducing malicious data directly into LLMs' training data can compromise their output. As LLMs are trained on web text, data poisoning isn't too challenging to execute, making it a significant concern. Could that even be
1
0
0