Dataleap
@dataleapHQ
Followers: 541 · Following: 29 · Media: 1 · Statuses: 85
Empowering business teams to build agents without asking engineering for help
San Francisco
Joined September 2022
Retrieving the right information from your long-term memory vector store is key to managing an agent's context well. Experimenting with different retrieval algorithms here can lead to significantly better results. Try it!
Random note on k-Nearest Neighbor lookups on embeddings: in my experience, much better results can be obtained by training SVMs instead. Not too widely known. Short example: https://t.co/RXO9xiOmAB It works because SVM ranking considers the unique aspects of your query w.r.t. the data.
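The linked example isn't reproduced here, but the idea can be sketched with scikit-learn on synthetic data (all sizes and parameter values below are illustrative, not from the tweet):

```python
import numpy as np
from sklearn import svm

rng = np.random.default_rng(0)
data = rng.standard_normal((1000, 64))
data /= np.linalg.norm(data, axis=1, keepdims=True)   # unit-normalize corpus
query = rng.standard_normal(64)
query /= np.linalg.norm(query)

# k-NN baseline: rank the corpus by cosine similarity to the query.
knn_ranks = np.argsort(-data @ query)

# SVM ranking: treat the query as the lone positive example and the whole
# corpus as negatives, then rank the corpus by the decision function.
x = np.concatenate([query[None, :], data])
y = np.zeros(len(x))
y[0] = 1
clf = svm.LinearSVC(class_weight="balanced", C=0.1, max_iter=10000)
clf.fit(x, y)
svm_ranks = np.argsort(-clf.decision_function(data))

print(knn_ranks[:5], svm_ranks[:5])
```

The intuition matches the tweet's last sentence: the SVM's hyperplane is fit to what distinguishes this particular query from the rest of the corpus, rather than using a fixed distance metric.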
@pinecone @weaviate_io 8/ Querying an index is easy too. You can perform unary queries or query with a list of vectors. Just provide the vectors you want to find similarities for and the number of results you want returned. Voilà!
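A toy in-memory stand-in for what 8/ describes (this is not the Pinecone or Weaviate API; all names here are made up for illustration), supporting both a unary query and a list of query vectors:

```python
import numpy as np

# Toy in-memory index: maps IDs to unit-normalized vectors.
index = {f"doc-{i}": v / np.linalg.norm(v)
         for i, v in enumerate(np.random.default_rng(1).standard_normal((100, 8)))}

def query(index, vectors, top_k=3):
    """Return the top_k most similar IDs for each query vector."""
    ids = list(index)
    matrix = np.stack([index[i] for i in ids])   # (n, dim)
    vectors = np.atleast_2d(vectors)             # accepts unary or batched input
    sims = vectors @ matrix.T                    # cosine similarity (unit vectors)
    return [[ids[j] for j in np.argsort(-row)[:top_k]] for row in sims]

q = index["doc-0"]
print(query(index, q, top_k=3))           # unary query
print(query(index, [q, index["doc-1"]]))  # query with a list of vectors
```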
@pinecone @weaviate_io 7/ Creating a vector index is simple with the right tools. Just import the necessary libraries, initialize your API key, and create an index. You can then insert your data as tuples containing the ID and vector representation of each object.
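The "insert tuples of (ID, vector)" step in 7/ can be sketched without any service at all. This toy class is a stand-in for a managed index (real services do this over an authenticated API; every name here is invented):

```python
import numpy as np

class VectorIndex:
    """Toy in-memory index mimicking the upsert shape described in the tweet."""
    def __init__(self, dimension):
        self.dimension = dimension
        self.vectors = {}

    def upsert(self, items):
        # items: list of (id, vector) tuples
        for item_id, vector in items:
            assert len(vector) == self.dimension
            self.vectors[item_id] = np.asarray(vector, dtype=float)

index = VectorIndex(dimension=4)
index.upsert([("a", [1, 0, 0, 0]), ("b", [0, 1, 0, 0])])
print(len(index.vectors))  # 2
```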
6/ Once you have a vector embedding, you need a way to run queries against it. That's where a managed vector search service like @pinecone or @weaviate_io comes in. You can store vector embeddings with IDs, tying the data back to the objects they represent.
5/ Obtaining a vector embedding requires ML techniques and a good understanding of the problem space. Image data is already numerical and easy to vectorize, but other cases may need more work.
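To make the "other cases may need more work" concrete: real systems use trained models (e.g. sentence-transformer networks) to embed text, but even a toy bag-of-words scheme shows the shape of the problem. This sketch is purely illustrative:

```python
# Toy text-to-vector embedding: one count per vocabulary word.
# A trained model would produce dense, meaning-aware vectors instead.
def embed(texts):
    vocab = sorted({w for t in texts for w in t.lower().split()})
    return [[t.lower().split().count(w) for w in vocab] for t in texts], vocab

vectors, vocab = embed(["the cat sat", "the dog sat"])
print(vocab)     # shared vocabulary
print(vectors)   # one count vector per text
```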
4/ Take the phrase "one in a million." Is it more similar to "once in a lifetime" or "a million to one"? By creating vector embeddings, machine learning goes beyond human intuition to quantify similarity. 🧠
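The example in 4/ can be made concrete with cosine similarity over naive word-count vectors (an assumed toy setup, not a trained embedding). Note that surface word overlap ranks "a million to one" closer, even though "once in a lifetime" is arguably closer in meaning; learned embeddings aim to capture the latter:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def bow(text, vocab):
    # Bag-of-words vector: captures lexical overlap only, not meaning.
    words = text.lower().split()
    return [words.count(w) for w in vocab]

phrases = ["one in a million", "once in a lifetime", "a million to one"]
vocab = sorted({w for p in phrases for w in p.lower().split()})
v = [bow(p, vocab) for p in phrases]

print(cosine(v[0], v[1]))  # "once in a lifetime": shares 2 of 4 words -> 0.5
print(cosine(v[0], v[2]))  # "a million to one": shares 3 of 4 words -> 0.75
```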
3/ Vector embeddings are numerical representations of complex data. They make it easier to run generic ML algorithms on sets of data. The idea is to convert real-world objects into numerical representations to measure similarity.
2/ To perform vector search, we use "vector embeddings" which convert complex data into simpler representations. These embeddings help quantify similarity between data points while maintaining their deeper meaning.
🧵 1/ First things first: What's vector search? It's a technique to find similar data points within a data set. You might be familiar with LIKE queries in SQL - this is a similar concept, but with more complexity and possibilities. 🕵️‍♀️
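The LIKE-query analogy from 1/ can be shown side by side with a brute-force similarity ranking. The table contents and toy 2-D vectors below are invented for illustration; a real system would use learned embeddings:

```python
import sqlite3

# SQL LIKE finds exact substring matches...
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (body TEXT)")
conn.executemany("INSERT INTO docs VALUES (?)",
                 [("the quick brown fox",), ("a lazy dog",), ("quick foxes",)])
like_hits = [r[0] for r in conn.execute(
    "SELECT body FROM docs WHERE body LIKE '%quick%'")]
print(like_hits)  # only rows containing the literal substring

# ...while vector search scores EVERY row by similarity to a query vector
# and returns the nearest, even with no exact text match.
vectors = {"the quick brown fox": [1.0, 0.2],
           "a lazy dog": [0.1, 0.9],
           "quick foxes": [0.9, 0.3]}
query = [1.0, 0.0]
ranked = sorted(vectors,
                key=lambda k: -sum(a * b for a, b in zip(vectors[k], query)))
print(ranked)  # nearest first
```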
1/ 🚨 Let's talk about AI Chatbot Security Risks: As companies race to deploy chatbots powered by large language models (LLMs), new security risks emerge that we need to address. Let's dive into these risks & discuss how we can approach them. Thread 👇
🧵 1/ Excited to share some tips on how to work with long documents leveraging @LangChainAI chains! Let's dive into the four common methods and their pros and cons. Remember, there's no one-size-fits-all solution – context is key!
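For context on the "four common methods": LangChain's chain types for long documents are "stuff", "map_reduce", "refine", and "map_rerank". Here is a framework-free sketch of the map-reduce pattern, with a stub `llm()` (a hypothetical stand-in for a real model call) so the control flow is visible:

```python
def llm(prompt):
    # Hypothetical stand-in for a real LLM call; truncates for illustration.
    return prompt[:60]

def split(text, size=200):
    # Naive fixed-size chunking; real splitters respect token/sentence bounds.
    return [text[i:i + size] for i in range(0, len(text), size)]

def map_reduce_summarize(document):
    chunks = split(document)
    partials = [llm("Summarize: " + c) for c in chunks]           # map step
    return llm("Combine these summaries: " + " ".join(partials))  # reduce step

doc = "word " * 200
print(map_reduce_summarize(doc))
```

The trade-off the thread alludes to: "stuff" is cheapest but limited by context size, while map-reduce scales to long documents at the cost of extra model calls and some loss of cross-chunk context.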
6/ ⚠️ The pressure to launch products without due diligence may lead to more AI chatbot misuse than necessary. However, prioritizing security & safe deployment over beating the competition is challenging in this environment.
5/ It may take years before we establish security best practices for AI chatbots; in the meantime, expect a surge of chatbot exploits & companies scrambling to fix them.
4/ Data Poisoning: Introducing malicious data directly into an LLM's training data can compromise its output. As LLMs are trained on web text, data poisoning isn't too challenging to execute, making it a significant concern. Could that even be