the community archive hackathon 🥳
We organized a hackathon in NYC to build on community archive data. It was fun, wholesome, and productive!
We organized a hackathon in NYC with the goal of building tools or doing research on the twitter community archive data.
It ran for about 8h with ~30 participants and over 15 submissions, some of them remote! The vibes were immaculate, and the feedback was that people generally had a lot of fun.
Our submissions ranged from infrastructure improvements, to apps and tools, to data science-flavored research.
The archive is a really special dataset of (as of now) 16M tweets from some of the most thoughtful people on the internet. It's also an ideal foundation to bootstrap a science of memetics from.
I'd never been in a place with so many people who "get it" and who were motivated and skilled enough to make contributions that actually advance our vision.
See our video for vibes.
Group pic!
Builds
Our prize structure was the following:
A first prize ($500) picked by our judges (xiq, Defender, and Tekne Labs)
A second prize ($200) picked by visakanv (because he's the patron saint of our Twitter)
Three popularity prizes ($100 each) voted on by participants
🥇 1st Prize: Ivan Vendrov - Tracking Meme Contagion Dynamics
Ivan demoed his analysis of the emergence of new words in the archive, and their pioneers.
The most viral tokens - tokens that nobody in the dataset tweeted prior to 2023 but many people used after, and their respective patient zeros and ones.
Here's a leaderboard of users ranked by how early they are, on average, to viral words. @visakanv is by far the most frequent earliest user of a word.
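To make the idea concrete, here is a minimal Python sketch of one way to find tokens that first appear after a cutoff date and to rank users by how early they adopt them. This is not Ivan's code (his notebook is the reference). The whitespace tokenizer, the `min_users` threshold, the function names, and the toy tweets are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date


def viral_tokens(tweets, cutoff=date(2023, 1, 1), min_users=2):
    """Map each new token to its adopters, ordered by first use.

    tweets: iterable of (user, day, text) tuples (a hypothetical shape).
    A token qualifies if nobody used it before `cutoff` and at least
    `min_users` distinct users used it afterwards.
    """
    first_use = defaultdict(dict)  # token -> {user: earliest day of use}
    for user, day, text in tweets:
        for token in set(text.lower().split()):  # naive tokenizer (assumption)
            seen = first_use[token]
            if user not in seen or day < seen[user]:
                seen[user] = day

    result = {}
    for token, users in first_use.items():
        if min(users.values()) < cutoff or len(users) < min_users:
            continue
        # Index 0 is "patient zero", index 1 is "patient one", and so on.
        result[token] = sorted(users, key=lambda u: (users[u], u))
    return result


def earliness_leaderboard(viral):
    """Rank users by their average adoption position across viral tokens."""
    positions = defaultdict(list)
    for adopters in viral.values():
        for position, user in enumerate(adopters):
            positions[user].append(position)
    averages = {u: sum(p) / len(p) for u, p in positions.items()}
    return sorted(averages, key=lambda u: (averages[u], u))


# Toy data, invented for this example.
toy_tweets = [
    ("a", date(2022, 5, 1), "hello world"),
    ("b", date(2023, 2, 1), "hello zorp"),
    ("c", date(2023, 3, 1), "zorp vibes"),
    ("a", date(2023, 4, 1), "zorp"),
]
viral = viral_tokens(toy_tweets)
leaders = earliness_leaderboard(viral)
```

On real data you would want a proper tokenizer and a much higher `min_users` threshold, since rare typos also look like "new words" under this definition.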
Judges unanimously agreed Ivan's hack was the most in line with our goal of open memetics research. Check the live app, Ivan's thread, or the jupyter notebook for more depth.
🥇 co-1st Prize: Priya Rose, Andrew Blevins, Ivan Vendrov: Rap Battle Generator
Ivan was also involved in the rap battle generator, which we thought was a really fun submission and an artsy intuitive counterbalance to the more quantitative contagion dynamics.
Check out the live app here and Priya's live-tweeted thread here, including a funny audio rap battle between Defender and me!
🥈 2nd Prize: ???
🥉 3rd Prize: Henry: Related Tweets Browser Extension
Henry (@left_pad) built a browser extension that shows tweets that link to your current webpage.
Example using the infinite craft webpage.
Check out Henry's thread of hacks!
🥉 3rd Prize: Henry and Josh: Thermal Tweet Printing LIVE
Henry brought a thermal receipt printer and they hooked it up to the tweets coming live into the archive! Once the demo started, the whole room lit up in wonder at tweets becoming physical artifacts!
Check out this sick video of the printer in motion.
🥉 3rd Prize: crinzo_: Infra to embed the whole database
Crinzo put together a way to easily extract embeddings using a cloud GPU for the entire archive, or for a specific user: https://github.com/enzokro/ca-embeds
This is a really helpful infrastructure contribution for us. It makes pre-made embeddings available to anyone who wants to build on them, allowing builders to skip the arduous process of embedding everything.
Explore and download the embeddings here. Read his thread for more detail.
Screenshot of the data explorer.
Christopher C. Smith: Chat with the Archive
Chris built a chat interface and agentic framework that lets an LLM make SQL queries to the archive in order to answer questions! Super cool!
Check out the repo and his twitter thread!
Christopher C. Smith: Social Network Graph Visualization
Chris also plotted a network visualization of the following graph of accounts in the archive. Unsurprisingly, I've got the highest PageRank score :D, followed by Visa, then Defender.
I'm really excited about network visualizations, especially memetic diffusion in networks!
Network viz and three main clusters.
Read the thread, or check out the repo.
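For readers unfamiliar with PageRank on a follow graph, here is a small pure-Python sketch of the standard power-iteration version. It is not taken from Chris's repo. The `follows` dictionary shape, the damping factor of 0.85, the iteration count, and the three toy accounts are assumptions for illustration.

```python
def pagerank(follows, damping=0.85, iterations=50):
    """Compute PageRank by power iteration.

    follows: dict mapping each account to the list of accounts it follows
    (a hypothetical input shape). Being followed by high-rank accounts
    raises your own rank.
    """
    nodes = set(follows) | {v for targets in follows.values() for v in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Rank held by accounts that follow nobody is spread evenly over everyone.
        dangling = sum(rank[u] for u in nodes if not follows.get(u))
        base = (1 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for u, targets in follows.items():
            if targets:
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new_rank[v] += share
        rank = new_rank
    return rank


# Toy graph: a and b both follow c, and c follows a.
toy_follows = {"a": ["c"], "b": ["c"], "c": ["a"]}
scores = pagerank(toy_follows)
top_account = max(scores, key=scores.get)
```

In this toy graph `c` comes out on top because two accounts follow it, while `b` ranks lowest because nobody follows it.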
Oliver King: Blind Spots - An Information Theory Perspective
Oliver searched semantic space for underexplored regions and tried to generate tweets that would fill them.
We think this is a very exciting direction for memetics research!
Check out Oliver's presentation slides here.
Alan G: Difference between vibes sent and received
Alan G compared the raw sentiment of tweets with their engagement-weighted sentiment, showing that more extreme emotions reach more people.
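The comparison can be sketched in a few lines of Python. This is only an illustration of the "sent vs. received" idea, not Alan's actual method. It assumes each tweet already has a sentiment score in [-1, 1] and an engagement count, and the function names and toy numbers are made up.

```python
def intensity_sent(tweets):
    """Average emotional intensity of what was written (each tweet counts once)."""
    return sum(abs(score) for score, _ in tweets) / len(tweets)


def intensity_received(tweets):
    """Average emotional intensity of what was seen (weighted by engagement)."""
    total_engagement = sum(engagement for _, engagement in tweets)
    if total_engagement == 0:
        raise ValueError("no engagement to weight by")
    weighted = sum(abs(score) * engagement for score, engagement in tweets)
    return weighted / total_engagement


# Toy data: (sentiment score, engagement count), invented for this example.
toy_scored_tweets = [(0.1, 10), (-0.9, 100), (0.8, 90)]
sent = intensity_sent(toy_scored_tweets)
received = intensity_received(toy_scored_tweets)
```

When `received` is larger than `sent`, as in this toy data, the strongly emotional tweets are the ones collecting most of the engagement.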
EmergentVibe: Semantic Correlations
EmergentVibe participated remotely and built a tool that generates a 2x2 from any pair of semantic queries and plots every profile on it based on their tweets. E.g. you can plot how "travel" and "happy" correlate.
Plotting "friendly" x "nerd"
Check out Emergent's thread here with more hacks!
Henry: Vibe Search
Semantic search in the community archive!
Check out his repo!
Henry: Download .json button
Henry (who had 4!!! submissions) was kind enough to add a download raw archive button to profiles in the community archive website.
E.g. Emmet's profile.
Defender and IaimforGOAT: Community Archive Firehose
IaimforGOAT built a web extension that sends tweets from your browser to the community archive in near real time.
Defender built a web page to view the tweets as they come in!
Erosika: Chat with a digi-clone!
Erosika demoed a chat interface with a digital clone based on an account's tweets!
Chat to a clone of VividVoid!
xiq: Embedding viz for anyone (2D and 3D)
I embedded tweets, clustered them, labeled them with an LLM, plotted them in 2D, and then added a time axis. A lot of the code was reused from Birdseye.
2D plot for my clusters!
The Event
It ran from 11AM to 7PM, with about 6h of focused work, an opening ceremony, lunch, and demos.
We were graciously hosted at Fractal Tech, with prizes and pizza supported by Tekne Labs and OpenRouter LLM credits from Triplicate.xyz.
We prepared some build ideas ahead of time, and linked to docs in our luma event page.
We also got pizza!
Testimonials
I had so much fun, the vibes were immaculate. Thanks to everyone who came, to Fractal Tech, Tekne Labs, and Triplicate!
So cool to see!