DataCebo

@datacebo

Followers
99
Following
25
Media
42
Statuses
79

An MIT spin-off that's making synthetic data a reality.

Joined October 2020
@datacebo
DataCebo
1 year
Excellent article in @Forbes today calling #syntheticdata “an all-too-rare example of…genuinely useful” generative AI, for the particular application of software testing. Read @jpwarren's profile of @datacebo and @kveeramac: https://t.co/CpWQWShgFE #bigdata #syntheticdata
0
4
7
@datacebo
DataCebo
7 months
SDV Enterprise v0.24.0 is out 🎉 This release adds features that help you generate higher quality synthetic data and improve ease-of-use. 🌟 Model hierarchical relationships in a table. Use the SelfReferentialHierarchy CAG pattern when you have a column in a table that
0
0
1
@datacebo
DataCebo
8 months
Today, we’re excited to introduce a powerful new bundle to the @sdv_dev: AI connectors. AI connectors address 2 key challenges that SDV users face when training generative AI models on datasets from enterprise data stores. (Link to the announcement: https://t.co/rHXe13g1Ru)
0
0
1
@datacebo
DataCebo
8 months
SDV Enterprise v0.23.0 is out 🎉 This release enhances your ability to program your synthesizer to find certain patterns and recreate them, whether it's through multi-table CAG patterns, single-table constraints, or pre-processing techniques that transform your data. 🏆
0
0
1
@datacebo
DataCebo
10 months
Today, we are excited to introduce a very powerful new framework to The Synthetic Data Vault: constraint augmented generation (#CAG for short). CAG addresses the shortcomings of generative models in capturing the context buried in enterprise data
1
1
4
@datacebo
DataCebo
1 year
Working with customers all over the world has taught us about one important, but often overlooked benefit of using #syntheticdata: increased data diversity. Data diversity refers to the overall variety of data that is accessible for a project. While it's a simple concept,
0
2
3
@datacebo
DataCebo
1 year
🚀🔥 #CTGAN has been downloaded over 2.5 million times. 🔥🚀 Released #thisweek in 2019: version 0.1.0 of #CTGAN as part of The Synthetic Data Vault, a Deep Learning-based #syntheticdata generator for single-table data that can learn from real data and generate synthetic data
0
1
2
@datacebo
DataCebo
1 year
By popular demand, we have added the ability to connect to databases to bring data to The Synthetic Data Vault (@sdv_dev). Users can now directly connect #SDV Enterprise to their databases, both to import real data and to export #syntheticdata. We have added #bigquery and
0
0
0
@datacebo
DataCebo
1 year
#otd in 1998 Yann LeCun (@ylecun) submitted a paper on gradient-based deep learning for document recognition. It took more than a decade before the world finally warmed to neural networks. He has since had his paper cited roughly 70,000 times, and in 2018 won the Turing Award,
0
0
0
@datacebo
DataCebo
1 year
๐Ÿ† We are pleased to share that DataCebo has been awarded a contract by the U.S. Department of Homeland Securityโ€™s (@DHSgov ) under the call for a Synthetic Data Generator. With The Synthetic Data Vault (@sdv_dev ) the DHS will be able to build, deploy, and manage sophisticated
0
1
4
@datacebo
DataCebo
1 year
Born #otd in 1950: the Turing Test. Alan Turing's paper from 74 years ago describes a modified version of the "imitation game" in which a human judge has to determine which of two typing partners is a computer. June 2024: In related news, one recent study found that human
0
0
2
@datacebo
DataCebo
1 year
One of our users exclaimed "These speedups are insane!" Our multi-table synthesizer in SDV Enterprise, called HSA Synthesizer, runs in less than 1 minute what takes HMA Synthesizer an hour, across 20 datasets. ❇️ We have been focusing on multi-table synthesizers.
0
1
5
@datacebo
DataCebo
1 year
In 1956, storing 5MB required a hard disk that weighed a ton. In 2024 a #generativeai model can capture the salient properties of terabytes of data in an entire database within a single file and recreate it on demand - what we call #syntheticdata. #otd in 1956 IBM launched
storagenewsletter.com
IBM Corp. is just celebrating its 100th anniversary as the company was founded on June 16, 1911. Consequently we come back here on one of the greatest innovations in the history of Big Blue and the...
0
0
2
@datacebo
DataCebo
1 year
Happy birthday to the late Dennis Ritchie, inventor of C and co-creator of Unix. C and C++ have played a key role in the big data revolution, having been the origin languages for some of the core components of popular ML libraries, including #PyTorch and #TensorFlow. Multics,
0
0
0
@datacebo
DataCebo
1 year
#OTD in 2016 we submitted the final camera-ready version of the Massachusetts Institute of Technology paper ⭐️ The synthetic data vault ⭐️ The paper said: "This synthetic data must meet two requirements: 1️⃣ First, it must somewhat resemble the original data statistically, to
0
3
5
@datacebo
DataCebo
1 year
Launched 25 years ago this summer: VMware 1.0, the first commercial product that allowed users to run multiple operating systems as virtual machines on a single x86 machine. Later known as VMware Workstation, it was an influential application that provided a framework for cloud
0
1
1