cyberpunkture

thoughts on technology, culture and the future

Science and Research; Then and Now

There’s been a lot of talk about the need for disruption in the way that science and research are done. Traditionally, a small group of people (or even just one person) would work for a while on something on their own and then publish it in a conference or journal when it was somehow ‘complete’. This publishing was done mostly on paper.

The modern era has brought a huge amount of technology that could improve this process, but in practice all it seems to have done is make it possible to download a PDF rather than having to find the actual physical paper. In fact, crappy policies by conferences, journals and professional organizations have actually made even this advance inaccessible in many cases. (Matt Welsh has a great blog post about Research Without Walls which calls on researchers to not agree to submit or review work that will not be publicly available online.)

Further pointing to the fact that we’re not taking enough advantage of technology is the success of the polymath projects in leveraging a distributed and open group of people to solve hard math problems by letting them easily collaborate on ideas for making progress. Certainly, this shows we can do better. People are solving hard math problems with global-scale collaboration and we’re still having arguments about whether it’s OK for organizations to be able to hide publicly-funded research behind paywalls.

Disruption

Michael Nielsen has a great piece in the Wall Street Journal (which I think should be open to anyone) that goes into some details and also describes some of the hurdles that such a transition might face. The core message is that the incentive system we have now is broken and doesn’t encourage people to share their results, their data or generally collaborate very well. Instead, all it incentivizes people to do is to produce publications which in turn help their reputation and likelihood of getting funding.

His point seems right on, but it comes across with a tone that makes me want to dig in deeper and ignore his advice. Sentences like “There are other ways in which scientists are still backward in using online tools,” simply make me want to scream. It’s not like scientists are actively trying to hide in a hole. We’re doing the best that we can with limited time to figure things out and responding to the reward structures we have.

Worse still, his tone is easy to pick up on and blogs that focus less on research are able to nearly parrot it back. This GigaOM post is a good example. It winds up pulling the same strings in me as the calls for the general public and Congress to review what science is done in this country.

That being said, I agree. We’ve made crap for progress even in my field of computer science. I think that we do better than most with there being viable venues for people to present their work every 6 months or so (more often in areas like databases where VLDB now has monthly submission deadlines). Further, it winds up being something like 6 months between submission and presentation. That means even if it’s accepted right off the bat, it can take a year from ‘finishing’ something until it’s presented.

At the same time we still struggle with actually ‘publishing’ results in the good sense with data and open access to the papers. USENIX is amazing and lets you freely distribute your work, posts it online for free and even posts the videos of presentations online for free. Other organizations—ahem, ACM and IEEE, ahem—have been less forward-thinking.

That’s just dealing with bringing the old model of publication into the new era where faster publication schedules and wider dissemination are possible. It doesn’t do anything to address new forms of collaboration.

Resolution?

Sadly, while Nielsen does go on to explain some easy fixes that will at least aim to provide open access to data and papers, namely mandating it as part of grant approval, he offers very little to address getting to the real-time, cross-group collaboration which he starts talking about with the polymath project.

Really, all he has to offer is this:

Grant agencies also should do more to encourage scientists to submit new kinds of evidence of their impact in their fields—not just papers!—as part of their applications for funding.

The scientific community itself needs to have an energetic, ongoing conversation about the value of these new tools. We have to overthrow the idea that it’s a diversion from “real” work when scientists conduct high-quality research in the open. Publicly funded science should be open science.

I think it misses the broader issue, which is that it’s hard to be a scientist or researcher today and the result is that we instinctively cling to any edge we have. The consistent cutting of higher education’s budgets and similar cuts in industry (Intel radically cut its collection of research labs in the last 2 years) have left far fewer true research positions available. When resources are scarce, it’s hard to convince people to share.

Sometimes we don’t share data because we think we can benefit from it if we hoard it. Other times we don’t collaborate because we worry about how credit will be divvied up. It’s entirely possible that these are actually non-issues. In fact, I think this is likely the case. Nonetheless, right now it’s dangerous to step out into this world since we don’t know the answers.

However, I think most of the time we don’t collaborate or share, it’s not out of malice or selfishness, but rather that it’s extra effort to share. I think that Nielsen underestimates the difficulty in releasing curated code or data sets. It’s not just a matter of posting a file to a web server. Even more so, researchers see little or no immediate benefit from such sharing.

Perhaps a first step would be creating a way for researchers to share some details of what they’re doing in exchange for people providing feedback and suggestions publicly. This model is already used to some degree when grad students and faculty give talks about their work in progress to closed audiences. Even so, I think it’s done too little and too late.

My Conclusions

In the end, I think that the goal of open access seems like something that’s relatively easy to obtain and will at least modernize the traditional publishing model. The more ambitious goal of broad collaboration to make progress more quickly is tantalizing, but I think merely yelling at scientists to believe in it and do it is the wrong way to get there.

http://www.ted.com/talks/roger_mcnamee_six_ways_to_save_the_internet.html

At 15 minutes long, the talk is worth watching all the way through. He makes 6 key points:

  1. Windows (really Desktop) is dead
  2. Index Search (Google) is dead (instead Wikipedia, Twitter, Facebook, LinkedIn, etc.)
  3. Apps will beat the Web
  4. HTML5 changes everything (really end of Google commoditizing all content)
  5. Tablets will win big (iPad has no valid competitors)
  6. Social is a sideshow (Facebook won)

I think point 1 is pretty obvious. Most people will not use a desktop for most of their time on a computer.

Point 2 is interesting. The justification is that search makes up a much smaller share of what people do on mobile devices, which ties back to point 1. Instead, people increasingly use apps that combine domain knowledge and context (Twitter, Wikipedia, Facebook, etc.) to do better.

Points 3 and 4 were muddled, but basically amount to saying that people will start to really differentiate content using HTML5 apps to become something more than “just the result of a Google search.”

Point 5 is something that I’m not sure I agree with. It’s entirely possible that we’ll pick up screens that look like tablets while we’re on the couch, but I think we’re rapidly moving to a world where we carry around a phone-like device which projects itself onto the other devices that happen to be near us at the time—including, possibly, tablet-like screens. In that world, the iPad isn’t really relevant. It’s just a screen.

Point 6 seems spot on. Facebook more or less won and covers enough of the space where social really matters that it’s hard for me to imagine the parts of my life that still need to be social-ized.

http://www.newscientist.com/blogs/onepercent/2011/11/electronic-contact-lens-displa.html

Bioengineers have placed the first contact lenses containing electronic displays into the eyes of rabbits as a first step on the way to proving they are safe for humans. The bunnies suffered no ill effects, the researchers say.

I’m still curious about how useful it is to have pixels on something plastered to your eyeball. Is there any way to actually make it so that you can usefully focus on something that close? Or even if you can, do so at the same time as focusing on the rest of the world to allow for actual augmented reality?

The longer range question all of that prompts is: Are the problems with getting a physical display to actually work right and overlay on top of the real world for somebody likely to be solved before we have a good enough understanding of the optic nerve to be able to just put the information there?

Normally, I’d say the answer was obviously that we don’t know enough about how to read and write data from and to the optic nerve, but this makes you think maybe it’s not as far off as you’d think:

http://gizmodo.com/5843117/scientists-reconstruct-video-clips-from-brain-activity

And the video: http://www.youtube.com/watch?v=nsjDnYxJ0bo

Still, very fascinating stuff.

Wired is claiming that the Stuxnet virus/malware/whatever is actually targeting particular pieces of industrial equipment that are likely to be used for uranium enrichment in Iran and then interfering with them slowly over the course of weeks.

http://www.wired.com/threatlevel/2010/11/stuxnet-clues/

I’m always wary of claims that computer viruses are targeting things other than just your data and possibly your computer’s ability to send spam, but this seems like it might be the real deal. An actual, state-created piece of malware aimed at trying to directly interfere with a real-world process. I’m both stunned and a little scared. I’d be curious to know if it actually had any effect.

I know almost nothing about SCADA systems in general, and absolutely nothing about the ones in question, but I’ve heard from a variety of people that I trust that they are quite vulnerable and lag something like 10-15 years behind where we are in computer security in general. I guess this is the right place to look if you wanted to hit something.

I’m still interested to see how this pans out when more determined journalists get a hold of it and start placing it into context rather than tech news reporting on an anti-virus company’s press release.

http://www.scientificamerican.com/blog/post.cfm?id=im-not-a-real-scientist-and-thats-o-2010-11-12

I just found this “invited blog” on Scientific American’s website and I thought it was interesting for a couple of reasons, though in the end I think it’s of more use for the discussion it starts than for what it says.

Basically, it argues that computer science should be part of philosophy because it proceeds by logical argument without any real experiments, repeatability or the like. Unfortunately the conclusions seem to be drawn from some really, really crappy examples.

I agree that computer science is not really pure, or even applied, mathematics. It’s not really engineering, because we think very differently than engineers do, and it’s certainly not a traditional science with hypotheses and the like. I might even grant that the formal logic from philosophy translates well into a lot of computer science, but to say that we don’t do experiments to confirm things is a bit wrong.

As a PhD student in systems and networks, I find that the only thing that matters in the end is the experiment. Even when I make systems that claim they are “easier to use”, I’ve run studies with users to show how easy they were to use. My friends that work on usable computing and HCI all run studies to show that what they’re doing is actually better. My friends in AI/machine learning run experiments to see how well their algorithms work in extracting knowledge from the web.

In fact, the only people I know who are computer science PhD students whose work doesn’t revolve around the experiment are the theoreticians who really are mathematicians by most accounts.

Admittedly, our experiments are still somewhat crude and we haven’t mastered exactly how to do repeatable experiments every time and our measure of statistical significance is often “this line is higher than that line”, but nonetheless we are much more hypothesis-test driven than the blog makes it out.

I was doing my usual reading through the news when I stumbled across an opinion piece by the former Director of National Intelligence Mike McConnell about how we should be preparing the nation’s cyber-defense strategy.

The piece is mostly a fluff-filled call to arms saying that we are woefully behind, but there’s no real reason for it and that really what we need is just the resolve to sit down and draw up some concrete plans and strategy for what it is that we’re going to do. I agree with most of that, but then I stumbled across this gem:

More specifically, we need to reengineer the Internet to make attribution, geolocation, intelligence analysis and impact assessment — who did it, from where, why and what was the result — more manageable.

This really perplexes me, because two paragraphs earlier he was talking about how Hillary Clinton was extolling the virtues of the Internet as a tool for free speech and democracy. Suddenly, when the U.S. needs to defend itself, we need exactly the tools that would make a repressive country best able to shut off the benefits of the Internet as a platform for expression.

It has just further convinced me that by keeping the current group of military and intelligence officials in charge of this, we will constantly be behind in the Internet-age.

Update: (3/2/2010) Wired wrote a story commenting on the same article pulling out the exact same sentence from McConnell’s op-ed. Good to know that I’m not the only one catching these things. They point out that McConnell has been fear mongering about this stuff in order to get bigger U.S. intelligence access to the Internet for years.

I spent the last 3 days using the Google Chrome beta for Mac OS X and I just switched back to Firefox, but I figured I’d catalog what I liked and didn’t like about it and what actually made me switch back. The most compelling aspects of Chrome are the technical back-end features where it isolates tabs from each other and provides a mechanism to track resource usage to a given tab.

On Windows, I understand that this all functions pretty well, but on OS X things feel a lot less polished. The tab task manager doesn’t exist and a bunch of the other features that any real web browser needs to have are still unimplemented.

  • Bookmarks Manager: This is the most glaring omission. It means you can’t delete, rename or move bookmarks around though you can add them. Oddly, when I imported my keyword search bookmarks from Firefox they work in Chrome which is nice, but I doubt I can add a new one.
  • Cookie Manager: This is admittedly only a minor annoyance as I think you can nuke all the cookies, but it’s still nice to occasionally go look and see what’s there.
  • Certificate Manager: This means you can’t permanently confirm “security exceptions” and have to go through a big ugly red screen for each such exception you want to put in each time you run Chrome.

Then there are a series of things which I’ve come to love in Firefox and expect of current web browsers that are missing.

  • Open All in Tabs: I use this to open all my web comics at once and scan through them. Also for some blogs and other things. It’s just something small that I rely on daily.
  • Vertical Three-finger Scrolling: This is the smallest annoyance, but in Firefox three-finger swipes up and down take you to the top and bottom of the page which is a useful shortcut on my MacBook.

All in all, I’m not overly disappointed with Chrome, but not wildly excited either. In the end, if it managed to match Firefox more or less in feature set, then I’d probably pick it for the better and more secure back-end tab isolation.

Last, but not least, there’s the whole lack of extensions, which they just fixed for Windows and it seems like they’re going to finish that for the Mac soon. I won’t judge it too harshly for that, but rediscovering how many ads the web has on it was less than pleasant.

Apparently ICANN has largely solved a problem with “domain tasting” using relatively straightforward economic means, which is cool and I wish we could see more systemic problems on the Internet approached this way. For those who don’t know (and I was one of these people), here’s what domain tasting is:

The move was intended to stop “domain tasting,” where someone registers a raft of domain names and then monitors those domains for up to five days to see which domains attract a lot of visitors. If the domain looks like a loser, a person could get a refund within five days, called the Add Grace Period.

The grace period is intended to allow people to be refunded, for example, if they made a spelling mistake while registering a domain. But many specialize in abusing the grace period by setting up thousands of Web sites crammed with advertising links on newly registered domains. If the advertising revenue exceeded the registration fees, the domain would be kept.

Pretty evil. Anyway, they’ve started to only refund part of the cost for the domains which are released during the grace period. At first they just kept $0.20 per domain, which didn’t have much effect, but more recently they’ve increased it to $6.75 per domain. The results are apparently impressive:

In a report, ICANN said Add Grace Period deletions for registries that have implemented the policy have dropped 99.7 percent between June 2008 to April 2009.

Cool beans! Now we just need to get other smart people thinking about where we can leverage simple economic ideas like this rather than spinning our wheels fighting technical battles.
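To see why the size of the fee matters so much, here is a minimal back-of-the-envelope sketch in Python. Only the $0.20 and $6.75 fees come from the article; the batch size, the fraction of domains kept and the profit per kept domain are numbers I made up purely for illustration, and the function names are my own.

```python
def tasting_profit(num_domains, keep_fraction, profit_per_kept, fee_per_dropped):
    """Net profit from tasting one batch of domains.

    All inputs except the fee are hypothetical illustration values.
    """
    kept = num_domains * keep_fraction
    dropped = num_domains - kept
    return kept * profit_per_kept - dropped * fee_per_dropped


def break_even_keep_fraction(profit_per_kept, fee_per_dropped):
    """Fraction of domains that must turn out to be winners to break even.

    Solves k * profit = (1 - k) * fee for k.
    """
    return fee_per_dropped / (profit_per_kept + fee_per_dropped)


if __name__ == "__main__":
    # Hypothetical: taste 1000 domains, 2% are keepers worth $10 each.
    for fee in (0.00, 0.20, 6.75):
        profit = tasting_profit(1000, 0.02, 10.0, fee)
        needed = break_even_keep_fraction(10.0, fee)
        print(f"fee ${fee:.2f}: profit ${profit:.2f}, "
              f"break-even keep fraction {needed:.1%}")
```

Under these made-up numbers the $0.20 fee leaves tasting barely profitable, while the $6.75 fee makes it a big loser unless a large share of the tasted domains are worth keeping. That at least matches the shape of what ICANN reported.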

http://news.bbc.co.uk/2/hi/technology/8193951.stm

Augmented reality has always been one of those things that I knew was going to happen and when it did, it would be big. Recently, I’ve instead been really wondering if it’s going to be big or if it’s just going to sneak up on us and suddenly be the way we all do things. The first video in that BBC link is actually pretty good at explaining the idea while keeping it grounded enough to understand what the first apps are likely to look like.

As it is, more than half my friends (as well as me) walk around with location-aware phones that give them location-specific information when they ask for it. For the most part it’s limited to finding things which are near you like gas stations, bus stops, food, driving directions, etc., but there’s no reason it has to stay that limited.

The only thing which seems to be missing is figuring out which direction the phone is pointed (being solved by digital compasses in many phones) and possibly image recognition. Microsoft’s Project Natal makes me think that people are going to at least have the balls to try to tackle some of the real-time image recognition stuff.
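The location-plus-compass part is simple enough to sketch. Here is a rough Python illustration, assuming the phone can hand you its latitude, longitude and compass heading. The function names and the 60-degree field of view are my own assumptions, not any real phone API. It computes the bearing from you to a point of interest and checks whether the phone is pointed at it.

```python
import math


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2.

    Returned in degrees clockwise from north, in the range [0, 360).
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0


def is_in_view(heading_deg, target_bearing_deg, field_of_view_deg=60.0):
    """True if the target lies within the (assumed) camera field of view."""
    # Signed angular difference, wrapped into [-180, 180).
    diff = (target_bearing_deg - heading_deg + 180.0) % 360.0 - 180.0
    return abs(diff) <= field_of_view_deg / 2.0
```

An app would run something like this for each nearby point of interest and only draw labels for the ones where `is_in_view` is true. Image recognition would then be about lining those labels up with what the camera actually sees.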

I’m not quite sure where it’s going, but it’s going fast and I’m excited even if it seems like my Palm Pre may not be the first platform for this stuff. I still can’t believe they re-made the same “all apps are just going to be web pages” mistake that Apple made when they first introduced the iPhone.

http://www.shapeways.com/themes/stainless_steel_3dprinting_gallery

Friends of mine in UW CSE and at the Intel Research lab in Seattle have been 3D printing stuff out of plastic for a while now and I’m always amazed, but it’s mostly prototypes even if they are useful. If you can actually get people to make you stainless steel printed things, that’s just fucking awesome.

The prices aren’t dirt cheap but you seem to be able to get reasonably complex things for less than $100 and the price is likely to come down with time. Maybe we really are going to live in an era where we can flaunt at least some things in the face of economies of scale.

It vaguely reminds me of some comments I overheard about how we’re in the pre-industrial age of software design where each piece of code is really artisanal. That implies that the next step is to move toward industrial production of code, but maybe that’s not the case. Maybe what cheap, on-demand 3D printing is saying is that some things can really always stay artisanal without too much cost.