Parsing Thermo Finnigan RAW files


In a rare move, I’m going to largely copy across a post from my work blog, because I hope it contains useful information. For background, I’m trying to write a simple Python script that extracts particular metadata from a .RAW file produced by a Thermo Finnigan mass spectrometer. The tools that exist for parsing these files require access to proprietary XCalibur libraries, which I do not have.

Thermo provided a link to MSFileReader, a ‘freeware’ COM object that should allow interaction with RAW files without an XCalibur installation. They also sent a PDF guide to the COM object. Although this will allow XCalibur to be avoided, the work is still Windows-bound.

Python and COM objects

Python can talk to COM objects through the win32com.client package. As a test, I installed Python, MSFileReader and the pywin32 libs on my netbook (which is a Windows 7 machine). I can import the required Python module, but need to extend sys.path somewhat:

  >>> import sys
  >>> sys.path.append('C:\\Python26\\Lib\\site-packages\\win32')
  >>> sys.path.append('C:\\Python26\\Lib\\site-packages\\win32\\lib')
  >>> from win32com.client import Dispatch
  >>> x = Dispatch("NAME")  # "NAME" is the COM object's ProgID, still unknown at this point

The key thing here is “NAME”:

The provided PDF gives C snippets for each method available in the COM object. These provide only one clue as to the possible name of the COM object:

  // example for Open
  TCHAR szPathName[] = _T("c:\\xcalibur\\examples\\data\\steroids15.raw");
  long nRet = XRawfileCtrl.Open( szPathName );
  if( nRet != 0 ) {
      ::MessageBox( NULL, _T("Error opening file"), _T("Error"), MB_OK );
      …
  }

XRawfileCtrl is used to call the Open() method. However, this and MSFileReader as “NAME” both fail (Invalid class string).

I then found ‘multiplierz’, which seems to use MSFileReader to create mzAPI. mzAPI focuses on access to the actual data rather than the metadata, but the code gives some good clues as to how to use the COM object. [doi:10.1186/1471-2105-10-364]

MSFileReader.XRawfile is used as “NAME” in this code.

So:

  >>> import sys
  >>> sys.path.append('C:\\Python26\\Lib\\site-packages\\win32')
  >>> sys.path.append('C:\\Python26\\Lib\\site-packages\\win32\\lib')
  >>> from win32com.client import Dispatch
  >>> x = Dispatch("MSFileReader.XRawfile")
  >>> x.Open("C:\\Users\\path\\to\\file\\msfile.RAW")
  >>>
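
The trial and error above can be wrapped up in a few lines. This is only a sketch: `first_working_progid` and `open_raw_file` are names I have made up for illustration, and the only COM calls used are `Dispatch` and `Open`, as in the session above. The list of candidates is simply the three names tried in this post. The `dispatch` argument exists so the search logic can be exercised without a Windows machine.

```python
def first_working_progid(candidates, dispatch):
    """Return (progid, com_object) for the first name that dispatch accepts."""
    errors = {}
    for progid in candidates:
        try:
            return progid, dispatch(progid)
        except Exception as exc:
            # With pywin32 this is a pywintypes.com_error ("Invalid class string").
            errors[progid] = exc
    raise LookupError("no candidate ProgID worked: %r" % (errors,))


def open_raw_file(path, dispatch=None):
    """Try the names from this post in turn, then open the RAW file."""
    if dispatch is None:
        # Imported lazily so the helper above stays importable off Windows.
        from win32com.client import Dispatch as dispatch
    candidates = ["XRawfileCtrl", "MSFileReader", "MSFileReader.XRawfile"]
    progid, raw = first_working_progid(candidates, dispatch)
    raw.Open(path)
    return progid, raw
```

On a machine with MSFileReader installed, `open_raw_file("C:\\Users\\path\\to\\file\\msfile.RAW")` should end up using `MSFileReader.XRawfile`, going by the results above.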

To be continued…


Telomerase – make your skin immortal!

I know that the beauty industry has made a habit of twisting science somewhat for its own ends (see this and this for instance), but this one takes the biscuit.
The wife spotted a piece in Harper’s Bazaar while she was in the hairdressers yesterday, about an amazing new beauty treatment (the article itself is hard to link to, but it’s number 3 in the list of “9 Skin Secrets for Spring“). Injections of telomerase for $1,500 a pop. Apparently it ‘stimulates resting stem cells’. Obviously the Harper’s piece has guff about it being Nobel-prize winning technology.
Telomerase is an enzyme that extends the DNA repeats at the ends of chromosomes. Without this activity, the telomeres get progressively shorter until the “Hayflick limit” is reached and the cell stops dividing or undergoes programmed cell death (there’s a reasonable review of the role of telomerase here: http://www.jco.ascopubs.org/cgi/content/full/18/13/2626).
Now I’m no expert, but as far as I know, telomerase is turned off in normal somatic cells, and telomerase activity has been associated with up to 90% of cancers (even its Wikipedia entry will tell you this much; a rather old paper with some concrete figures can be found here: http://dx.doi.org/10.1016/S0959-8049(97)00062-2). I’m not suggesting for a second that injecting telomerase will give you cancer (the overwhelming probability is it will do nothing at all), but this seems to be an amazing example of abusing science in the name of ‘beauty’.

Stack Overflow, BioStar and – shock, horror – some code


Stack Overflow reached critical mass a few months ago now. The site gets upwards of 6 million unique visitors a month. The chances are, if you write code, you know it exists, and you’ve received some sort of help there, whether directly or by proxy. In its wake, Stack Overflow has spawned Stack Exchange, a question and answer platform that anyone can pay to use to set up their own site. So there are now Stack Overflow-type sites for any number of topics.

Amongst these sites, there have been a couple of attempts to set up science Stack Exchanges (http://asksci.com/ & http://science.stackexchange.com/), but to my mind, science as a whole is not specific enough; even biology is probably too broad an area for Stack Exchange to work as a platform. As a result, questions are too hand-wavy, and communities have not really seemed to build. The key to Stack Overflow’s success is that it has very tightly defined boundaries: only questions about programming are accepted, and anything else is removed for being off-topic. The site’s creators even set up more sites, Super User and Server Fault, to keep Stack Overflow on topic.

It seems, then, that bioinformatics is the perfect use case for Stack Exchange: a narrower domain than science or biology, with an already web-savvy community ready to coalesce around a useful focal point. Until recently, however, no one had made the site. Then a couple of weeks ago, http://biostar.stackexchange.com/ started to get some attention on Twitter and FriendFeed. It’s early days, but the site has made a good start, with some interesting questions and some good, intelligent answers. My main concerns would be:

  • Not enough users, no critical mass achieved.
    • The site seems to be gaining some traction, and getting more active by the day.
  • Not enough questions, no reason for users to keep coming back.
    • This does remain an area for concern, but is also starting to pick up a little.
  • No financial backing, the site may disappear after the test period comes to an end.
    • Stack Exchange is not a cheap platform, and a site like this will need funding to continue. However, the administrator, Istvan Albert, has insisted on the Google Group, set up for ‘meta’ discussion surrounding the site, that the site is funded for a year at least.
Finally, the real motivation for this post… A question on BioStar made me revisit some semi-abandoned code in order to post an answer, and I thought it was quite a nice snippet. About 15 lines of Python that utilises the UniProt ID Mapping service to automate protein ID conversion. I’ve stuck the code into a gist, and I thought I’d stick it up here too.
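
What follows is not the original gist, just a minimal sketch of the same idea. It assumes the current UniProt REST ID mapping endpoints under https://rest.uniprot.org/idmapping (submit a job, poll its status, fetch the results), which postdate this post. The helper names `parse_results` and `map_ids`, the default database names, and the polling interval are all illustrative assumptions, and the results endpoint may return only the first page of a large job.

```python
import json
import time
import urllib.parse
import urllib.request

# Assumed endpoint of the current UniProt ID mapping service (not from the post).
API = "https://rest.uniprot.org/idmapping"


def parse_results(payload):
    """Turn a results payload into {source_id: [target_id, ...]}."""
    mapping = {}
    for row in payload.get("results", []):
        target = row["to"]
        # Some targets come back as objects rather than plain strings.
        if isinstance(target, dict):
            target = target.get("primaryAccession", str(target))
        mapping.setdefault(row["from"], []).append(target)
    return mapping


def map_ids(ids, source="UniProtKB_AC-ID", target="GeneID", poll_seconds=2):
    """Submit a mapping job, wait for it to finish, and return the mapping."""
    form = urllib.parse.urlencode(
        {"from": source, "to": target, "ids": ",".join(ids)}
    ).encode()
    with urllib.request.urlopen(API + "/run", data=form) as resp:
        job_id = json.load(resp)["jobId"]
    while True:
        with urllib.request.urlopen(API + "/status/" + job_id) as resp:
            state = json.load(resp).get("jobStatus")
        if state not in ("NEW", "RUNNING"):
            break  # finished (or failed; the results request below will then raise)
        time.sleep(poll_seconds)
    with urllib.request.urlopen(API + "/results/" + job_id) as resp:
        return parse_results(json.load(resp))
```

Calling `map_ids` with a list of accessions should return a dictionary keyed by the input IDs. Only `parse_results` is independent of the network, so that is the part worth testing in isolation.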

Posted via email from Simon’s posterous


Impact factors, Colossus and the Wakefield retraction.

(Graphs from The Independent (London), 21 June 2008)

Today was one of those days where lots of interesting stuff turns up. On the BBC, there were two very good pieces about the flaws in the scientific process, specifically closed peer review and impact factors.

I also notice that the BBC are running a daily piece about the history of computing in the UK this week; parts one and two have already been published. Today’s article about Colossus is especially good.

Also, after last week’s excellent, and damning, judgement from the GMC regarding Andrew Wakefield’s reprehensible behaviour in his research into the ‘link’ between MMR and autism, today The Lancet finally pulled the paper in which his findings were published 12 years ago. Wakefield et al. (1998) (doi:10.1016/S0140-6736(97)11096-0) has now been retracted from the public record after The Lancet concluded that the claims made by the researchers were ‘false’ (http://www.thelancet.com/journals/lancet/article/PIIS0140-6736%2810%2960175-7/fulltext – apologies for paywall).



Wildlife photographer stripped of award


The winner of the Wildlife Photographer of the Year award has been disqualified after judges ruled that the featured wolf was probably a “model”.

It is tough to imagine how you would go about getting this photo of a wild animal. The story given by the photographer on winning the prize was reasonably convincing, however.
