Bananaphone Semantics
Twitter's working on something called Annotations, and I'm very excited about it. At the moment, this may only be interesting to techies, but if it goes the way I think it will, it'll be useful and understandable for everyone.
I got excited after reading Raffi Krikorian's post offering an "Extremely preliminary look at Twitter's Annotations". I'd heard about them when they were first announced, but this time it really clicked for me what they are.
Annotations are just arbitrary metadata, shoved inside a tweet. Maybe you include a URL to a picture in the annotations, instead of taking up your precious 140 with the link. Maybe your tweet said that you lost weight today, and so you attach a list of your weights for each day in the past week. There are plenty of possibilities. It's just named hashes with key/value pairs. Do whatever.
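To make "named hashes with key/value pairs" concrete, here is a minimal sketch in Python. The payload shape, the annotation names ("photo", "weight_log"), the keys, and the find_annotations helper are all assumptions made up for illustration; this is not Twitter's actual format or API.

```python
import json

# Hypothetical annotated tweet: the text stays short, and the extra data
# rides along as a list of named hashes, each holding key/value pairs.
tweet = {
    "text": "Down another pound today!",
    "annotations": [
        {"photo": {"url": "http://example.com/scale.jpg"}},
        {"weight_log": {"unit": "lb", "mon": "182", "tue": "181"}},
    ],
}


def find_annotations(tweet, name):
    """Return every key/value hash stored under the given annotation name."""
    return [a[name] for a in tweet.get("annotations", []) if name in a]


# An app that understands "photo" picks it out; any other app ignores it.
print(json.dumps(find_annotations(tweet, "photo")))
```

An app that has never heard of "weight_log" simply skips it, which is why no one has to agree on the names up front.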
This data won't necessarily even be visible in most Twitter apps, but it'll be there. At first, it'll be used for little niche things, just like hashtags are. But once certain groups of data start getting used more and more, people will organically standardize on certain things (like ways to link in podcasts and photos, or maybe even how to attach other identities, like your Facebook username). And as they do, different apps will find ways of making use of it, and we will have one seriously diverse soup of data flowing through Twitter's pipes.
What Twitter is doing here is setting up the tiniest framework possible, and bowing out of deciding exactly what data can and can't be on a tweet. People will just figure it out. It's an approach to standards that I like. It's also a huge validation of their choice to switch to a schemaless database, where storing nested hashes, and making them searchable and indexable, is like breathing.
While I think this will be great for Twitter, what I like most about it is using it as a model for how the Web at large should evolve. I really appreciate this excerpt from What Twitter Annotations Mean from ReadWriteWeb, which also quotes Krikorian:
"People who believe in building standards are conerned about our blase attitude about how we want to run annotations," Krikorian says. He believes that the developer community will work things out for itself, just as it has in the past. "There has been a lot of emergent behavior around how to relate to tweets anyway, without our imposing much structure around it. The Twitter platform is continuously evolving - the developers will figure it out. Twitter developers iterate in public."
That's likely to be cold comfort for people focused on the power of structured data standards. Many people are calling for Twitter to embrace the well-built efforts of the Semantic Web community. Krikorian says that 90% of Twitter developers don't know what the Semantic Web is but that there's certainly room for standards lovers to work within the Annotations scheme.
The best standards emerge organically, even after great frustration. HTML5, despite emerging from a standards body, is a reflection of how people constantly mangled HTML4 for a decade to get their jobs done, and of what drove people to use Flash where HTML couldn't fill the gaps.
There's a lot of blind optimism and scornful mockery surrounding the Semantic Web, and I don't like any of it. Its proponents should recognize when they're asking people to do a lot of onerous work for little return. I've been a professional software developer for 5 years now, and I have precious little interest in learning and using bulky standards like RDF or SPARQL. Life is too short to worry about mega-standards that can let me express every possible fact about everything.
But the Semantic Web's detractors, which include most of my colleagues and peers, should be able to get behind the basic idea that the Web should become more connected and discoverable over time. I wish they'd spend less time bashing on the way people are currently pursuing this vision, and more time working on a better way. Maybe that way looks a lot like Twitter's Annotations.
It's the developers that keep their eyes and minds on the shape of the future that get to determine it. I'd prefer if the ones who do are those who value simplicity and connectedness, and trust in other people to figure our way out of this mess of a Web together.
Twitter is going in the right direction but it seems they are reinventing the weel. This drives me crazy. Facebook is doing the same with OpenGraph. If anything I hope this will make people aware of Semantic Technology and eventually we'll all find our way back to the w3c standard. Triples FTW
Apparently my auto-complete likes to play games :p One more point: I agree that the cost of implementing RDF is too high for the current return. But IMO these are growing pains that developers have a responsibility to push through. I'm hoping that the barrier of entry will be reduced. With Ruby libraries like http://rdf.rubyforge.org/ generating the meta data should be much easier.
I'm with Brian on this one. I share a healthy skeptical attitude about semantic technologies (if anyone saw the SemWeb panel at SXSW this year, it was sad evidence that there is a place for skepticism) but I also heavily believe in the potential of this technology and spend a lot of time with it. I'm nervous about the tendency toward over-simplification. Simplify, yes. But big gains will come from big efforts, even while stop-gap measures are put in place until better solutions are constructed. (I'll save my rant about how Twitter is one of these. :P ) It would be sad to see all of those potential gains disappear because of a desire to cater to lower-common-denominator developers. (I will quickly add a disclaimer here. I do NOT mean that developers that don't like the semantic web are lower-common-denominators. But there is a class of pseudo-developers emerging today. They are a very important population to pay attention to and cater to. We've never had this before, and it will be incredible to see what they do as simple building blocks (like annotations) are put in their hands.) But let's not scale back ambitions in their names. Eventually, nobody (hardly anybody) should have to slave through RDF specs, et al. When good tools are created that abstract and automate the creation of powerful standards-encoded data, we can have the best of both worlds: simple and flexible use-cases and technologies, and the incredible potential that ontologies and inference-able linked data offer.
It's my hope and expectation that the makeup of developers starts tilting even more heavily towards "pseudo"-developers, and casual developers. I hope we start getting more 5th graders into the developer pool! :)
The web is only going to get ontologized and classified by the people who own each piece of it, and as that becomes a broader and broader audience, you absolutely need to build a Semantic Web on top of technologies that are easy to adopt, and fun to use.
If this were something that only the archivists of the world needed to worry about, make it as robust and abstract and correct as you want. But you need to win over every developer and pseudo developer and non-profit webmaster out there. And unfortunately, the archivists of the world have come up with a system that only resonates with other archivist-minded people.
You can't take Twitter lightly: regular people use @replies and #hashtags like it's nothing, which is the sort of thing that I expect from IRC rooms. Instead of scoffing at it as a stopgap, you'd do better to use it as a model for how you get people to find making connections between data fun.
One note: "It's also a huge validation of their choice to switch to a schemaless database..." It's not a huge validation unless Twitter Annotations takes off. If it flops, then Annotations does not validate the switch to a schemaless DB. (I think the performance impact is what Twitter really wanted/needed, and Annotations just happens to work well with Cassandra. Also, you can do key/value pairs in a schema-ful database, just FYI.)
Arbitrarily nested key-value pairs are less easy in a schema-ful database, as are key/values where you don't know the keys ahead of time.
I meant that it's a validation in the sense that it makes even launching Twitter Annotations much, much easier. Trying out a project like this with a schema'd database would likely mean a lot of serialization and deserialization work, which I doubt will fly on a site of their scale.
Schema for key/value pairs:
CREATE TABLE keyValuePairs ( [key] NVARCHAR(MAX) NOT NULL, [value] NVARCHAR(MAX) NULL );
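For a sense of the serialization and deserialization work that the replies above refer to, here is a rough sketch that pushes a nested annotation through a flat two-column table like this one. It uses Python's built-in sqlite3 module as a stand-in for a schema'd database. The dotted-path key convention, the flatten and unflatten helpers, and the sample annotation are assumptions made up for illustration, not anything Twitter has described.

```python
import sqlite3


def flatten(data, prefix=""):
    """Yield (dotted_key, value) rows for an arbitrarily nested dict.

    Assumes keys never contain "." and that leaf values are strings.
    """
    for key, value in data.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from flatten(value, path)
        else:
            yield path, str(value)


def unflatten(rows):
    """Rebuild the nested dict from (dotted_key, value) rows."""
    result = {}
    for path, value in rows:
        *parents, leaf = path.split(".")
        node = result
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return result


# Hypothetical annotation with two levels of nesting.
annotation = {
    "photo": {
        "url": "http://example.com/a.jpg",
        "size": {"w": "640", "h": "480"},
    }
}

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE keyValuePairs ("key" TEXT NOT NULL, "value" TEXT)')
conn.executemany("INSERT INTO keyValuePairs VALUES (?, ?)", flatten(annotation))

rows = conn.execute('SELECT "key", "value" FROM keyValuePairs').fetchall()
restored = unflatten(rows)
print(restored == annotation)
```

The table itself is trivial; the cost is in the flatten and unflatten steps around it, plus whatever convention you invent for encoding the nesting in the key column.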