Proposal for a Shared Technology Infrastructure
August 25, 2009
Since we last met together in New York on April 23, the Foundation and its advisors have been thinking very hard about the future direction of what began as our pilot project initiative in conservation documentation in digital form. Many of these projects have exceeded original expectations, particularly for the quality and depth of inter-related art historical, technical, archival, and scientific content, such that they can no longer be considered solely in the realm of “conservation documentation.” We are now on the verge of transitioning what began as “the pilots” to a small but robust group of “prototypes” which we hope will serve the field internationally as the basis for new and expanding on-line interdisciplinary research.
As you know, among the Foundation’s primary considerations for continuing funding for any projects involving software development is ensuring their sustainability. While the projects to date have been strong in terms of their various accomplishments and institutional commitment, their further development and future success demand that we carefully consider efforts to manage costs and leverage development resources. Writing and maintaining software code is not, and should not become, the primary business of museums or art-history institutes. Consequently, when we meet in London on September 14, we would like to initiate a discussion about the ways in which we could collectively lead these projects to develop in a way that:
After careful consideration of a number of possible paths forward, we have concluded that we would very much like to see all of the projects brought onto a single, common technology foundation.
In order to frame the September discussion, it is important for us to provide in advance a potential framework for such a path that anticipates the most important questions, so that you and your colleagues can think hard about the plan and come prepared to discuss your reactions. We are extremely eager to discuss these ideas with you, as well as their impact on your institutions and any expectations or concerns you may have about taking part in a larger technology effort.
The most effective way to make each project as productive and sustainable as possible is to reduce the ongoing costs of the technology infrastructure required to support your work. The most effective way to do that is to build a single infrastructure that would support all of the current projects, as well as those that we hope will follow. In order to achieve the greatest possible degree of efficiency and economy, as well as to deliver the highest quality results, we propose to contract the software development work out, under your guidance and governance, to a group of exceptional technologists who are already working on museum projects. In effect, they would become your “vendor” for this project.
Another important purpose in spinning off the technology component is to ensure that project staff remain as free as possible to concentrate efforts on the scholarship going forward. As we looked at what would be required to continue individualized technical development for the various projects, it became clear that doing so would not only severely challenge the available resources of some of your institutions, but would also force several of your people into what amounts to two full-time jobs: scholar and technologist. A separate software initiative would still ensure that your technical people oversee the technical aspects of the project; in fact, their participation and governance would be essential. However, unlike the “every-project-for-itself” model, their participation in project-related technical activities would be manageable within the confines of a normal work-week.
We are persuaded that any other strategy (even a reduction to as few as two platforms) would lead to constantly escalating costs and would almost certainly hamper scholarly collaboration over time.
Together with other participants from the projects, and with the assistance of technical experts provided by the Research in Information Technology (RIT) program of the Mellon Foundation, the Trustees would work to finalize the specifications for a shared infrastructure and to negotiate delivery of the infrastructure with a chosen vendor. The Trustees would oversee design and development efforts at a high level (involving the individual projects for all testing and acceptance), would provide feedback to the vendor when and as any need for changes may arise, and would authorize payment to the vendor as the work is performed to their satisfaction.
Once the infrastructure software has been delivered and the terms of the technology grant have been fulfilled, the Trustees could disband—although, at the discretion of the institutions, the Board may choose to continue as a means of coordinating further collaborations.
We feel that the selection of members of the advisory group should favor practicing technologists rather than “academic” computer scientists (whose bias would inevitably be toward the new and unproven rather than toward reliability and performance). In particular, it would be useful to include software engineers from the CollectionSpace project, and those who are building RDF technologies and have the enterprise design and development experience that has yet to come into scope for the prototype projects. [Please see page 5 for a separate but related discussion of RDF technology].
At least one or two advisory board members should be selected from the user community, preferably from among management‐level personnel, to help the entire advisory board better understand the real‐world problems, focused on cultural content, that this software development is attempting to solve.
Given the unmet needs that are specific to the research interests of these prototype projects, the advisory panel might well include expertise in managing images and image manipulation as well as in emerging technologies and community standards in online academic publishing.
At the same time, however, certain purely internal aspects of the projects must necessarily be reconciled: for instance, a means must be developed for the combined storage of data from multiple projects. These changes would be the responsibility of the Trustees, but they should not be noticeable to anyone other than the technologists working on the project.
User‐facing functionality changes faster than any other portion of a software project, far faster than back-end technology. Consequently, none of the projects will likely ever consider themselves “finished”, as users will be clamoring for new features almost immediately, leading to an iterative front-end development. What the sustaining infrastructure needs is not a single, ideal front‐end, but rather an infrastructure that makes the continual refinement and customization of front-ends to meet evolving scholarly needs as quick, powerful, and inexpensive as possible. Two projects funded by the Mellon Foundation RIT Program, Fluid and FluidEngage, offer that kind of infrastructure as well as a software design community already involved with museum-related projects that can assist in realizing particular design aspirations.
In a shared project infrastructure, the project teams and users would be able to agree, in all likelihood rather quickly and effortlessly, upon (a) the distinctively best features of each of the projects; and (b) the features that would prove problematic in another setting or with another target user population. Any consensus would then become the basis for common user interface elements, remembering that the community of end users is relatively focused in its interests and goals.
The labor pool, while still small compared to relational databases, is in absolute terms reasonably large and multinational; moreover, the National Gallery has a relationship with the University of Southampton, which is one of the international leaders in RDF technologies and a nexus for the sorts of skilled labor required to make the project successful.
2. The lack of widely accepted standards for search and retrieval still poses something of a risk.
The pace of improvement is rapid, and the costs of picking what turns out later to be the “wrong” approach can be mitigated substantially by careful system design.
3. Uncertainty has been expressed about scalability, suggesting that current RDF triple‐store technologies may not be able to handle the large number of triples that the prototype projects would generate. RDF is a prolix data standard which translates even comparatively simple relationships into a large number of “triples,” which must be stored, managed, and retrieved expeditiously if an RDF solution is to be workable.
This argument is largely outdated, as the technology is maturing rapidly, and growth in scalability has been particularly rapid over the last two years. Currently available triple‐stores, both commercial and open source, provide storage that is more than adequate to the needs of these projects at present, and the growth in performance is accelerating at a pace that should ensure that even massive scholarly adoption of the software produced by the prototype projects would not tax an RDF‐based infrastructure.
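To give a rough sense of why RDF is described as prolix under point 3, the following minimal sketch flattens one small, nested record into subject-predicate-object triples. It is an illustration only: the record, its field names, and the "ex:" prefix are hypothetical, and the function is not drawn from any of the prototype projects or from any particular triple-store.

```python
# Hypothetical catalogue record. Every identifier, field name, and value
# here is an illustrative assumption, not data from any prototype project.
record = {
    "id": "ex:painting/1",
    "title": "Portrait of a Man",
    "creator": {"id": "ex:person/1", "name": "Example Painter"},
    "treatments": [
        {"id": "ex:treatment/1", "year": 1998, "conservator": "A. Example"},
        {"id": "ex:treatment/2", "year": 2005, "conservator": "B. Example"},
    ],
}


def to_triples(node):
    """Flatten a nested record into (subject, predicate, object) triples."""
    subject = node["id"]
    triples = []
    for key, value in node.items():
        if key == "id":
            continue  # the identifier is the subject, not a statement
        predicate = "ex:" + key
        # Treat single values and lists of values uniformly.
        children = value if isinstance(value, list) else [value]
        for child in children:
            if isinstance(child, dict):
                # Link to the nested resource, then describe that resource.
                triples.append((subject, predicate, child["id"]))
                triples.extend(to_triples(child))
            else:
                triples.append((subject, predicate, child))
    return triples


triples = to_triples(record)
for triple in triples:
    print(triple)
print(len(triples))  # 9 triples for a record with only a handful of fields
```

Even this very small record, with a title, one creator, and two treatments, yields nine separate statements; a full body of art historical, technical, and archival documentation would multiply accordingly, which is the scale question that point 3 raises.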