ICSOC Day 1 Keynote – Services for Science

The 6th International Conference on Service Oriented Computing is on in Sydney this week. NICTA is a sponsor, and I managed to score a registration to attend. Ian Foster opened with an interesting keynote. (Preceded by a 30-minute delay fussing with Mac technology issues!) He spoke on “Services for Science” – how SOA is being used to support knowledge creation in science. There is currently surprisingly strong growth in online services providing data and analysis, in astronomy and especially in the biomedical field. He talked about the caGrid network. Ontologies are key there for the metadata of experimental results – Ian commented that the community is very “neat” (not scruffy) in being explicit and standardised in the representation and organisation of its data.
It’s interesting that for representing scientific workflows they’ve dropped BPEL in favour of the workflow notation and supporting infrastructure in Taverna. The workflows are used not only to coordinate data and analyses, but also to communicate methods and, in principle, to promote reuse. But the caGrid leaders recognise that it’s hard to design workflows for reuse, and hard to achieve reuse in practice. Ian also discussed experimental use of functional programming techniques to support provenance – capturing computations as first-class entities for scientific audit, review, and mining. He finished with some discussion of scalability and text mining of research publications.
I think some of the issues now being explored in the e-science domain have interesting analogues that have already been thrashed out in software engineering. The two fields are quite similar in some ways: practised at an industrial scale, both involve teams of knowledge workers working on complex and partly shared electronic assets. Large-scale reuse and variation have been made methodical in Software Product Line Engineering, and the provenance issues are very similar to those that are well known in the established discipline of (Software) Configuration Management.