
Numerical ocean modeling in the cloud

Riha, S.

1 Abstract

In the proposed study, we investigate the economic feasibility of providing cloud-based high-performance computing (HPC) services to the numerical ocean/weather modeling community. The unique selling point of the proposed services is to offer software consulting in combination with expert knowledge of oceanography/meteorology. In this proposal we focus on providing the service on top of infrastructure of mainstream public-cloud providers such as Amazon Web Services, Google Compute Engine or Microsoft Azure, to enable research consultancy companies of small- and medium size to take advantage of cost-effective "on-demand" HPC infrastructure.

Note: This document is structured as required by Förderrichtlinie "Modernitätsfonds" (Federal Ministry of Transport and Digital Infrastructure, 2016a), which also mandates the use of the German language. The present document is written in English to reach a larger audience. A German version may be submitted within the next few weeks. The author is not a native English speaker and welcomes suggestions for improvement.

2 Management summary

Numerical ocean/weather modeling systems have been used for decades to enhance the value of observed oceanographic/meteorological data. They form an essential basis for decision-making processes and logistics in the management and exploitation of natural resources, transport, defense, disaster mitigation and the tourism industry. Governmental agencies of several countries have recently launched initiatives to generate additional value from available geophysical data by fostering its commercial exploitation (Department of Commerce 2015; Federal Ministry of Transport and Digital Infrastructure 2016b). The commercial valorization of observed data crucially depends on the ability to refine existing data in a customized manner that addresses specific questions arising in the context of the commercial application. This can only be achieved through highly flexible numerical ocean/weather modeling, possibly provided as a commercial service.

To date, it is viable only for governmental agencies or large enterprises to commission oceanographic/meteorological modeling studies, which are typically conducted by academia and a small number of scientific consulting companies (e.g. Open Ocean; StormGeo; Oceanweather; EXWEXS). These theoretical studies often go hand in hand with a survey of new geophysical data, and in many cases the contracting authority finances both. Surveying is extremely expensive, since it requires chartering research vessels and similar equipment with very high operating expenses. The costs of numerical modeling studies, on the other hand, are comparatively low, because the only required technical infrastructure is a computing environment. In principle, this enables small and medium-sized private companies to independently provide research and consulting services in the form of so-called desktop studies.

A main obstacle for scientific consulting companies of small and medium size is the required up-front investment in high-performance computing (HPC) infrastructure. Of course, cost considerations are not unique to small and medium enterprises. In fact, there is a general trend for governmental agencies and universities to downscale their on-premise computing infrastructure and take advantage of cloud-based services, in the hope of reducing costs and increasing productivity. While this tendency has been obvious for some time, particularly for basic mainstream services such as email and business applications, it is noticeable even in highly specialized areas such as HPC in general, and ocean/weather modeling in particular (Vance et al., 2016).

In the past decade, several companies have successfully developed a business model around services enabling high-performance computing on top of infrastructure of mainstream (public) cloud providers such as Amazon Web Services, Google Compute Engine or Microsoft Azure. Most of these companies attempt to accommodate HPC demands of all scientific fields and areas of engineering. However, the requirements for the computational infrastructure vary significantly between scientific disciplines. We argue that catering to a single discipline has the advantage of being able to focus software development on features that are most relevant in the particular field.

In the proposed study, we investigate in more detail if it is feasible to provide on-demand HPC services specifically to ocean/weather modelers, possibly in combination with scientific consulting. The service thus consists of two distinct modules.

The first module (Module 1) provides software consultancy services, specializing in automated provisioning of hardware and software infrastructure for numerical modeling studies within a public cloud. Customers are governmental agencies, universities and private research and consulting companies. These customers employ scientists who operate the numerical model, and want to move their infrastructure to the cloud in order to reduce costs and increase flexibility.

The second module (Module 2) offers scientific consulting for customers who do not employ scientific staff. The main service is the design of custom-tailored numerical modeling studies, which specifically address the questions of the contracting entity. The goal is to provide answers within a shorter time frame, at a lower cost, and in a more flexible manner than studies conducted by universities. These product features are particularly important when serving agile companies of small or medium size, including start-ups. A key prerequisite for delivering this service successfully is access to a cloud computing infrastructure as planned in Module 1. Results are delivered in various forms, such as synthetically enriched and carefully analyzed geophysical data sets, printed or digital publications (optionally peer-reviewed), interactive web content, etc.

Potential project partners and stakeholders are universities, research institutions, public cloud providers, cloud-based HPC providers, engineering software companies, and finally software companies developing cloud computing platforms. In the initial stages of this project, we focus on (1) market research, (2) evaluating interest of potential stakeholders, and (3) designing a minimum viable product (within Module 1) for a specific combination of cloud provider and numerical ocean model.

3 Project objective

The overall objective of the project is to enable and facilitate the commercial valorization of existing geophysical data sets, in accordance with recent initiatives set out by various governmental agencies (Department of Commerce 2015; Federal Ministry of Transport and Digital Infrastructure 2016b). The present proposal is based on the following premise:

The refinement of observed geophysical data through numerical ocean/weather modeling is a vitally important core requirement for the commercial exploitation of this data.

The reason is that the spatio-temporal resolution of both observed and synthetic data provided by governmental agencies is often too coarse for utilization in localized commercial applications. Examples for highly localized applications include prediction of oil spill pollution, studies of wastewater transport, highly accurate weather forecasts, etc. A logical follow-up question is therefore:

How can small and medium enterprises (in particular, startup companies) gain access to customized, high-resolution numerical ocean/weather simulations?

Today, this is difficult. Most scientific projects are conducted by universities or a small number of specialized private research and consulting service providers, which specifically target large corporations as their customers. Small and medium enterprises cannot afford to commission customized modeling studies. To remedy the situation, a new type of service provider must enter the market, which enables small and medium enterprises to contract (obtain) numerical ocean/weather modeling studies (results) satisfying the following requirements:

  1. The modeling study must be custom-tailored to the particular need of the contracting company. This may or may not require scientific consulting offered by the service provider.
  2. The study must be cost-effective.
  3. Results must be available quickly.

The proposed project will enable, or at least facilitate, the establishment of such flexible consulting services. However, the benefits of the project are not limited to fostering small and medium sized companies. Large research institutions and corporations benefit as well.

4 Business description

In the feasibility study we assess the profitability of providing the following two services, which may be offered in a modular manner and are henceforward referred to as Module 1 and Module 2.

4.1 Module 1: Software consulting

  • Product name: Numerical ocean/weather modeling platform as a service
  • Customer groups: Governmental agencies, universities, research and consulting companies (including small and medium-sized companies). Actual end users of the service are scientists or software engineers with a strong scientific background, employed by these organizations.
  • Business drivers:
    • Desire to reduce on-premise infrastructure and associated costs
    • Avoiding delays/costs caused by job scheduling
    • Demand for infrastructure specifically optimized for numerical ocean/weather modeling
    • Enhancement of scientific progress by overcoming capacity limitations. While the size of a cluster in the public cloud may be limited (e.g. to 512 cores), the number of simultaneously running clusters is limited only by the cloud provider's capacity.
  • Product details:
    • Production/customization of software that automates the provisioning of a cloud-based compute cluster optimized for numerical ocean/weather modeling. Automation includes:
      • Provisioning of compute, storage and networking infrastructure (”infrastructure as code”)
      • Possibly, deployment of the numerical model itself, and the software stack for pre- and post-processing of model data, visualization, etc.
    • Flexibility. Customers maintain 100% control over the provisioning process, and can modify it according to their needs. The academic/scientific environment highly values
      • Independence
      • Transparency
      • Reproducibility
      • Accessibility
    • Portability. The configuration tools are portable between cloud providers, to avoid ”provider lock-in”. The software works in combination with most public cloud providers and popular open-source software platforms for cloud computing (OpenStack, etc.)
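
To illustrate what "infrastructure as code" and portability could mean in practice, the following minimal sketch describes a cluster as a provider-neutral data structure that can be stored under version control and re-targeted to another provider. The class `ClusterSpec`, its field names and the storage size are hypothetical and chosen for illustration only; the node and core counts are taken from the example of Tambe and Kaiser (2015) cited in section 6.3. This is not the format of any existing provisioning tool.

```python
import json
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class ClusterSpec:
    """Provider-neutral cluster description (hypothetical schema)."""
    provider: str            # e.g. "aws" or "openstack"; labels are illustrative
    compute_nodes: int
    cores_per_node: int
    memory_gb_per_node: int
    shared_storage_gb: int   # assumed value, not taken from the proposal
    model: str               # numerical model deployed on the cluster

    @property
    def total_cores(self) -> int:
        return self.compute_nodes * self.cores_per_node


# 32 nodes with 16 cores and 60 GB each, as in the cited ANSYS example.
spec = ClusterSpec(
    provider="aws",
    compute_nodes=32,
    cores_per_node=16,
    memory_gb_per_node=60,
    shared_storage_gb=500,
    model="ROMS",
)

# The same description re-targeted to a different provider.
private_cloud_spec = replace(spec, provider="openstack")

print(json.dumps(asdict(spec), indent=2, sort_keys=True))
print("total cores:", spec.total_cores)
```

Because the description is plain data, customers can inspect, modify and reproduce it, which is the kind of transparency and reproducibility listed above. Translating such a description into provider-specific resources is the actual engineering task and is not shown here.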

4.2 Module 2: Scientific consulting

  • Product name: Numerical ocean/weather modeling and consulting
  • Customer groups: Enterprises and governmental agencies which require geophysical data enhanced by numerical ocean/weather modeling. These organizations may not employ scientific staff in the field of oceanography/meteorology.
  • Business drivers:
    • Demand for scientific consulting combined with custom-tailored, regional (localized), cost-effective and rapid numerical ocean/weather simulations by enterprises (including, but not limited to small and medium size enterprises) and governmental agencies.
    • All of the above listed for Module 1.
  • Product details: The main service is comprehensive oceanographic/meteorological scientific consulting. It can be viewed as an additional service to Module 1, optionally including full management of the computational modeling platform.

Since this proposal is for a short-term (6 months) feasibility study, it focuses mainly on Module 1, whose implementation we regard as an enabling, necessary driver for Module 2.

5 Project partners and stakeholders

Potential project partners are

  • The scientific community
  • Technology companies. The potential role of these stakeholders requires a brief explanation.
    • Public cloud providers (e.g. AWS, Google Compute Engine, Microsoft Azure). AWS, for example, sponsors development of a cluster provisioning tool for their infrastructure, called CfnCluster (Amazon Web Services, 2016). We assume that there may be interest in collaboration.
    • Cloud-based HPC providers, e.g. Cycle Computing and Alces Flight. Both companies provide ”add-on” services on top of public cloud providers. To our knowledge, Cycle Computing collaborates with various major cloud providers, while Alces Flight seems to focus exclusively on AWS. The relation of these companies to AWS's own cluster-provisioning software project (CfnCluster) remains unclear to the author. In any case, these two (and other similar) companies may be interested in participating in the present project, particularly if they see a benefit in extending their expertise in the area of meteorology/oceanography.
    • Engineering software companies (e.g. ANSYS, Inc.). A major motivation for the present proposal is the presentation of Tambe and Kaiser (2015), which resulted from a collaboration between ANSYS, Inc. and AWS. Although ANSYS, Inc. serves the engineering sector, the technical challenges of transferring their products to the cloud are very similar to those we expect in the present project.
    • Software companies developing cloud computing platforms (e.g. OpenStack, which is developed by companies such as AT&T, Hewlett Packard, IBM, Intel, Rackspace, Red Hat, Suse, and many others). Their potential role is further explained in section 8.

The objective of the proposed feasibility study is to assess potential roles of these stakeholders within the project, and to evaluate the advantages and disadvantages of involving them.

6 Industry background

6.1 Numerical ocean/weather models

Numerical ocean/weather models are mathematical ocean/weather models, formulated on discrete computational domains according to the fundamental laws of physics (Haidvogel and Beckmann 1999; Kantha and Clayson 2000). When a model is run on a computer, prescribed initial and boundary conditions (temperature, salinity, flow velocity, etc.) are processed, and new (”synthetic”) data is generated. To link this terminology with the context of Förderrichtlinie ”Modernitätsfonds”, one may think of the prescribed data as observed data, and the synthetic data as refined data. The amount of observed data fed into a model depends on the complexity of the model. In highly idealized models, observed input data can consist of little more than a single number, for example the average observed density difference between two fluids in different geographical locations (e.g. Riha and Peliz 2013). More realistic models incorporate a variety of observed data, such as wind stress, radiative heating, freshwater/salinity changes caused by evaporation and precipitation, freshwater input from rivers, etc. In the most realistic models used for forecasting, real-time observed data is continuously assimilated into the model (Chassignet and Verron, 2006). The predictive skill of ocean modeling systems heavily depends on the quality and density of the observed data fed into the model. Note that in ocean models used for biological or biogeochemical studies, additional (biological or biogeochemical) data is necessary to drive the respective sub-model.

6.2 Computational infrastructure for numerical ocean/weather modeling

Numerical ocean/weather modeling requires large computational resources. Simulations are typically performed in distributed-memory computing environments, consisting of many connected individual computers or nodes. In each node, one or more CPUs access internal memory which is inaccessible to all other nodes. The numerical grid carrying the model variables is partitioned into blocks, and each node performs a computation on a single block. We refer to the connected system as a cluster. Synchronization and communication of individual processes in a cluster are usually achieved by using a standardized communication protocol (MPI, see Lusk et al., 2009). Doubling the spatio-temporal resolution of a numerical model typically increases the computational cost of running the model by a factor of about 8. Sufficiently high resolution is desirable in many numerical modeling applications, but is bounded in practice by the capacity of the available computing infrastructure.
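
The two quantitative statements in this paragraph can be illustrated with a few lines of Python. The sketch below makes two assumptions that are not stated in the text: the factor of about 8 is interpreted as doubling the resolution in two horizontal directions and halving the time step while leaving the vertical grid unchanged, and the grid is split into a regular arrangement of rectangular blocks. The function names and the example grid size are hypothetical.

```python
from math import ceil
from typing import Tuple


def refinement_cost_factor(refine: int, horizontal_dims: int = 2) -> int:
    """Cost multiplier when grid spacing and time step are both divided by `refine`.

    Assumes the vertical grid is unchanged and the time step shrinks in
    proportion to the horizontal grid spacing.
    """
    more_points = refine ** horizontal_dims  # more grid points per time step
    more_steps = refine                      # more time steps per simulated day
    return more_points * more_steps


def block_shape(nx: int, ny: int, blocks_x: int, blocks_y: int) -> Tuple[int, int]:
    """Size (in grid points) of the largest block of an nx-by-ny grid."""
    return ceil(nx / blocks_x), ceil(ny / blocks_y)


print(refinement_cost_factor(2))     # 8
print(block_shape(512, 256, 8, 4))   # (64, 64)
```

In this example a 512 by 256 grid distributed over 8 by 4 = 32 blocks leaves each process with 64 by 64 grid points per vertical level. If the vertical resolution were doubled as well, the same reasoning would give a factor of 16 rather than 8.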

The acquisition, maintenance and management of a cluster is expensive. Universities and governmental agencies sometimes under-provision their IT departments with high-performance computing (HPC) infrastructure for economic reasons. That is, the maximum throughput of the acquired HPC infrastructure cannot meet the effective demand during peak utilization. If the number of computing tasks submitted from individual scientists/projects exceeds the capacity of the infrastructure, the tasks are simply ”queued” (i.e. they have to wait in line) until previously submitted tasks are completed. This is referred to as job scheduling or batch queuing. While it allows the respective institution to provision according to the average workload instead of the peak workload, it necessarily causes waiting times for individual projects. To date, this is simply accepted as an inconvenient consequence of limited on-premise computing resources.
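
The effect of provisioning for average rather than peak workload can be sketched with a small first-come, first-served queue simulation. This is a simplified illustration, not a model of any real scheduler: every job is assumed to occupy exactly one "slot" of the cluster, and the job list is invented for the example.

```python
import heapq


def queue_waits(jobs, slots):
    """Waiting time of each job in a first-come, first-served batch queue.

    jobs  -- list of (submit_time, run_time) tuples
    slots -- number of jobs the cluster can run at the same time
    """
    free_at = [0] * slots          # min-heap: time at which each slot becomes free
    heapq.heapify(free_at)
    waits = []
    for submit, run in sorted(jobs):
        earliest_free = heapq.heappop(free_at)
        start = max(submit, earliest_free)
        waits.append(start - submit)
        heapq.heappush(free_at, start + run)
    return waits


# Four jobs of four hours each, submitted within one hour (invented numbers).
jobs = [(0, 4), (0, 4), (1, 4), (1, 4)]

print(queue_waits(jobs, slots=2))  # [0, 0, 3, 3]
print(queue_waits(jobs, slots=4))  # [0, 0, 0, 0]
```

With two slots the last two jobs wait three hours each; with four slots nobody waits, but half of the capacity sits idle whenever fewer jobs are submitted. An on-demand cloud cluster aims to provide the second behavior without paying for idle capacity.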

6.3 Cloud-based HPC

Several companies provide access to cloud-based, on-demand supercomputing for fields such as genomics, machine learning, simulation, and scientific computations as a service. Here we specifically consider a business model which aims to facilitate delivery of cloud-based HPC solutions in the form of a paid ”add-on” service on top of general services provided by the mainstream cloud providers such as Amazon Web Services (AWS), Microsoft Azure and Google Compute Engine (see Fig. 1).

Figure 1: Sketch of a network topology for cloud-based scientific computing. The scientist’s personal workstation is connected to a cloud-based compute cluster via a virtual private network (VPN). The cluster is configured by a service provider, according to the scientist’s needs.

Established companies which use this model are, for example, Cycle Computing and Alces Flight. The former company has been cited by the media as a ”truly disruptive” innovator, creating a market that never existed before, by providing HPC compute power to small companies/departments or researchers that until recently never had access to supercomputers (McKendrick, 2016).

As mentioned above, numerical ocean/weather modeling requires large computational resources. To give an example, the DKRZ in Germany is a national service provider operating a supercomputer to enable numerical ocean/weather/climate simulations. Their current (2016) supercomputer features a total of 100,000 processor cores with InfiniBand®-connected compute nodes. Compared to this standard, the maximum capacity of cloud clusters assembled from resources offered by mainstream cloud providers is modest. To give a concrete example from the related field of engineering, Tambe and Kaiser (2015) report that a computational fluid dynamics solver of the ANSYS® software suite deployed on Amazon Web Services scales very well to 256 cores, and on larger problems even up to 512 cores, using 16 to 32 compute nodes, each of which has 16 cores and a minimum of 60 GB of memory, interconnected with a 10 Gbit/s network. This comparatively modest maximum (efficient) cluster size is partly due to the lower network bandwidth of 10 Gbit/s offered by most cloud providers, as compared to recent InfiniBand® specifications featuring a bandwidth on the order of 100 Gbit/s or more. Another performance-limiting factor is hardware virtualization, which is employed by most mainstream public cloud providers. The essential point made by Tambe and Kaiser (2015), however, is that the currently available capacity of mainstream cloud infrastructure is perfectly sufficient for a large number of practical applications. This finding is of central importance for the current proposal.
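
Whether a solver "scales very well" is commonly quantified by its parallel efficiency, that is, the measured speedup divided by the ideal speedup relative to a baseline run. The following minimal sketch computes this quantity. The wall-clock times are invented for illustration and are not results reported by Tambe and Kaiser (2015); only the core counts follow the example above.

```python
def parallel_efficiency(wall_times):
    """Parallel efficiency relative to the run with the fewest cores.

    wall_times -- dict mapping number of cores to wall-clock time in seconds
    Returns a dict mapping number of cores to efficiency (1.0 = ideal scaling).
    """
    base_cores = min(wall_times)
    base_time = wall_times[base_cores]
    efficiency = {}
    for cores in sorted(wall_times):
        speedup = base_time / wall_times[cores]
        ideal_speedup = cores / base_cores
        efficiency[cores] = speedup / ideal_speedup
    return efficiency


# Invented timings: 1, 16 and 32 nodes with 16 cores each.
timings = {16: 1600.0, 256: 125.0, 512: 80.0}

for cores, eff in parallel_efficiency(timings).items():
    print(f"{cores:4d} cores: efficiency {eff:.3f}")
```

With these invented numbers the efficiency drops from 1.0 on a single node to 0.8 on 256 cores and 0.625 on 512 cores. A cluster size is usually considered efficient only while this value stays above some threshold chosen by the user.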

Very recently, Vance et al. (2016) described how cloud-based HPC is increasingly used in numerical ocean/weather modeling. They provide an introduction to the use of cloud computing in the atmospheric and oceanographic sciences, with an emphasis on scientific applications and examples rather than on the infrastructure of cloud computing.

7 Action plan

7.1 Market research

  1. Assess the demand for cloud-based ocean/weather modeling. Due to the strong influence of scientific staff on decision-making processes in universities and research institutions, it is sensible to distinguish clearly between economic reasons for investment on the one hand, and scientific incentives on the other hand.
    • The economic advantages of cloud computing have been extensively discussed in other areas of computing. How do they apply in the specific context of numerical ocean/weather modeling?
    • How does science benefit? It has been argued that the possibility to perform a parameter sensitivity study (ensemble experiment) on 100 different 512-core clusters simultaneously may fundamentally change the way in which ocean, weather and climate research is conducted. Is this reasonable? Vance et al. (2016) discuss the advantages of cloud computing from a scientific perspective.
    • Several topics within market research overlap with engineering issues. Typically, the most important concern of scientists is how large (computationally intensive) numerical problems can become, under the condition that cloud-based cluster computing remains efficient.
  2. Regarding the symbiosis/competition with established companies (e.g. Cycle Computing; Alces Flight), assess whether it is viable to focus on the ”niche” market of numerical ocean/weather modeling. How large is this market segment? What is the achievable market share within this segment?
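
The ensemble experiment mentioned in item 1 can be made concrete with a short sketch that enumerates the members of a parameter sensitivity study, one per cluster. The parameter names and value ranges are hypothetical and are not taken from any particular model configuration; the cluster size of 512 cores is the figure quoted in the text.

```python
from itertools import product

CORES_PER_CLUSTER = 512  # cluster size quoted in the text


def ensemble_members(parameters):
    """Return one dict of parameter values per member (full factorial sweep)."""
    names = sorted(parameters)
    return [
        dict(zip(names, values))
        for values in product(*(parameters[name] for name in names))
    ]


# Hypothetical parameters: 10 x 10 values give 100 ensemble members.
sweep = {
    "wind_stress_factor": [0.80 + 0.05 * i for i in range(10)],
    "bottom_drag_coefficient": [1.0e-3 + 2.0e-4 * i for i in range(10)],
}

members = ensemble_members(sweep)
print(len(members), "clusters,", len(members) * CORES_PER_CLUSTER, "cores in total")
```

Running all members simultaneously would occupy 100 clusters and 51200 cores for the duration of a single simulation, instead of queuing 100 consecutive runs on one on-premise cluster.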

7.2 Engineering and science

During the 6-month period, we focus on defining and creating a minimum viable product by restricting attention to

  • a single cloud provider. We choose Amazon Web Services, which we assume to have the biggest market share and representative technology. Limiting our attention to a single cloud provider allows us to use ”proprietary” (in the sense of ”non-portable”) helper tools to simplify the provisioning process. Although this necessarily leads to non-portable code, it provides a proof of concept and yields a concrete hardware/networking topology pattern, which can later be adapted to other cloud providers.
  • a single numerical ocean model. We use the Regional Ocean Modeling System (ROMS) (Shchepetkin and McWilliams 2005; Haidvogel et al. 2008; Shchepetkin and McWilliams 2009), due to its
    • focus on regional-scale applications
    • widespread adoption by the international ocean modeling community
    • permissive free software license (Open Source Initiative, 2016)
    • comparatively well-maintained and publicly accessible discussion forum and documentation (The ROMS/TOMS Group 2016a; The ROMS/TOMS Group 2016b).

The main engineering tasks in this feasibility study are as follows:

  1. Configure a test suite for benchmarking computational efficiency and scalability. The test suite will answer the question of how large (computationally intensive) numerical problems can become, under the condition that cloud-based cluster computing remains efficient. Is the 512-core limit accurate? The ROMS source code includes pre-configured benchmarking tests, which may be adapted to address questions specifically arising in the context of cloud computing (virtualization, networking, I/O, etc.). These modifications will incorporate opinions and previous experience of members of the ocean modeling community, in particular the ROMS user community. In the course of this process, the role of the modeling community as a stakeholder in the project will become apparent. Involvement by members of the community may range from sporadic feedback on the one end of the spectrum, to continuous active involvement in discussions and/or contributions of code on the other end of the spectrum.
  2. Produce software code representing ”infrastructure as code”, meeting the specific demands of numerical ocean/weather modeling. Provide the following functionality:
    • Automated provisioning of compute, storage and networking infrastructure of the cluster
    • Automated deployment/configuration of the scientific software stack, including pre- and post-processing software
    • In the early stage of this project we use Amazon’s CfnCluster (Amazon Web Services, 2016) tool for cluster provisioning, which facilitates proof of concept deployments and which can quickly be extended to support different clustered applications. A particularly attractive feature is the pre-configured integration of third-party scheduler systems (also referred to as batch-queuing systems, such as Sun Grid Engine, OpenLava, etc.) into Amazon’s cloud infrastructure, which automatically allocates compute nodes as a function of the number of pending jobs in the scheduler.
  3. A major task (and challenge) is to determine a suitable workflow for cloud-based numerical ocean/weather modeling studies. Since the scientist's workstation is connected to the cluster via the internet (Fig. 1), the user experience may suffer if bandwidth is low and/or latency is high. Application design must be guided by the objective to (1) minimize the transfer of large amounts of data, and (2) avoid unnecessary user interaction which could be perceived as unresponsive due to high latency. In this area the proposed project may greatly benefit from collaboration with established service providers in similar fields, such as ANSYS, Inc.
  4. For our particular cloud provider, what is the right combination of instance types (CPU and memory) and networking infrastructure for numerical ocean/weather modeling?
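
The scheduler integration described in item 2, in which compute nodes are allocated as a function of the number of pending jobs, can be illustrated by a simple scaling rule. The sketch below is a hypothetical rule written for this document; it is not the algorithm used by CfnCluster or by any particular scheduler, and all parameter names are assumptions.

```python
def nodes_needed(pending_jobs, cores_per_job, cores_per_node, max_nodes):
    """Number of compute nodes to request for the jobs waiting in the queue.

    Hypothetical rule: request enough nodes to start every pending job at
    once, but never more than `max_nodes`.
    """
    if min(pending_jobs, cores_per_job) < 0 or cores_per_node <= 0 or max_nodes < 0:
        raise ValueError("invalid job or node parameters")
    cores_wanted = pending_jobs * cores_per_job
    nodes_wanted = -(-cores_wanted // cores_per_node)  # integer ceiling division
    return min(nodes_wanted, max_nodes)


# Three pending 64-core jobs on 16-core nodes, with a cap of 32 nodes.
print(nodes_needed(pending_jobs=3, cores_per_job=64, cores_per_node=16, max_nodes=32))   # 12
print(nodes_needed(pending_jobs=0, cores_per_job=64, cores_per_node=16, max_nodes=32))   # 0
print(nodes_needed(pending_jobs=10, cores_per_job=64, cores_per_node=16, max_nodes=32))  # 32
```

An empty queue leads to zero nodes, so no compute cost accrues while no simulation is running. The cap reflects a limit on the efficient cluster size, such as the 512 cores (32 nodes of 16 cores) discussed in section 6.3.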

8 Valorization strategy

8.1 Module 1

A long-term objective is to sell expertise in setting up infrastructure for numerical ocean/weather modeling. End users are either scientists or software engineers with a strong scientific background. It is therefore necessary to gain a reputation within the scientific community. In this early stage of development, we use our preliminary results to

  • exchange opinions and ideas with the scientific community.

The obtained results are not only valuable for end users, but also for technology companies such as cloud-based HPC providers (Cycle Computing; Alces Flight) and cloud providers (Amazon, Google, Microsoft, etc.). We use the obtained results to

  • exchange opinions and ideas with cloud providers in general, and cloud-based HPC providers in particular.

In this proposal we focus on offering services on top of public cloud providers, which enables small- and medium sized companies to perform HPC modeling studies without having to invest in on-premise infrastructure. However, we point out that a large part of the obtained results is equally relevant for private clouds. This is significant, given that large research facilities will likely continue to operate private on-premise clouds, be it for economic reasons or for reasons of data protection. Hence, it may be profitable to develop HPC services on top of open-source cloud software platforms such as OpenStack. To further evaluate this idea, we use our results to

  • exchange opinions and ideas with developers of open-source cloud software platforms.

8.2 Module 2

Innovative oceanographic/meteorological scientific consulting services result from technological advances obtained through Module 1. Cloud-based modeling studies are less expensive, more flexible and provide results more quickly.

9 Benefits to society

Numerical ocean/weather studies have been conducted by governmental agencies for decades, hence their benefit to society need not be elaborated here. More recent efforts to generate additional value from available geophysical data by fostering its commercial exploitation will certainly be beneficial to the economy, provided that this effort targets the creation of a self-sustaining economic ecosystem. We believe that our project is a vitally important contribution to this end. We envision a future in which cost-effective, flexible and rapid meteorological/oceanographic scientific consulting services are accessible to small and medium-sized companies as well as large corporations and governmental agencies. The study proposed here is a logical precursor to accelerate development in this direction.

10 Legal issues regarding the use of geophysical data

The services provided by Module 1 do not raise legal issues concerning ownership, dissemination or use of geophysical data. The service merely provides the technical infrastructure to manipulate data. It is the customer's responsibility to comply with existing licensing agreements attached to input data, and formulate appropriate licensing agreements for (refined) output data.

In Module 2, consultancy services may include the use of licensed geophysical data. The legal right of use must be defined in detail before the data is accessed. To this end, all stakeholders draft (and agree upon) a legally effective contract before the start of a project.

11 Use and enhancement of existing data sets

This study does not require surveying of new geophysical data. Existing data may be used as input data, for testing purposes of the numerical modeling platform. In principle, modeling results may be stored for future use. However, it should be clear that the objective of this project is not directly the refinement of existing data, but the creation of the technical infrastructure that allows such refinements to be done in a more cost- and time-efficient way.

12 Financial plan

The table in Fig. 2 lists the estimated expenses of the proposed feasibility study for a total duration of 6 months. The plan includes the following costs:

  • Purchase of relevant scientific publications (Vance et al. 2016, etc.), as well as a printer, ink and paper to print them.
  • Installation and operation of an internet connection with a minimum symmetrical bandwidth of 2 Mbit/s (down- and uploads), which is necessary for testing. Estimates are based on information provided by T-Systems DSL Business (2016).
  • Costs of Amazon Web Services, necessary for testing.
  • Travel allowance. The applicant plans to attend at least one meeting or conference.
  • Cost of labor (of the applicant). The estimate is based on a monthly net salary of €2426.59 for a full-time postdoctoral position in Germany with more than 3 years of experience (TV-L E13, Stufe 3) including 15.5% health care insurance cost, based on Tarifvertrag für den Öffentlichen Dienst der Länder 2015a, Tarifgebiet West.
Figure 2: Estimated expenses over a period of 6 months. Support intensity and effective funding amount are based on a funding intensity of 70% for small enterprises.

References