Design Guidelines

Use the following guidelines to design a POSIX file system that meets the needs of your workload. The guidelines provide a framework to help you assess your storage requirements:

  • Evaluate the available storage options

  • Size your Sycomp Storage cluster

Workload requirements

Identify the storage requirements of your high-performance workloads. Define your current requirements, making sure to account for future growth. Use the following questions as a starting point to identify the requirements of your workload (a small sizing sketch follows the list):

  • How much storage capacity do you need today? In a year?

  • How many I/O operations per second (IOPS) or throughput in GBytes per second do you need? Do you need additional capacity to achieve the performance target?

  • Will data need to be moved between on-premises and the cloud?

  • Is there existing data in a Google Cloud Storage bucket that needs to be moved into the cluster?

  • Do you want to schedule data movement from one storage type to another? For example, migrate a file from the file system to a Google Cloud Storage bucket.

  • Do you need persistent storage, scratch storage, or both?

  • Do you have a backup software vendor that you use?

  • Do you have a DR strategy?

  • How many application nodes need access to the data?
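One way to turn these answers into a concrete starting point is a small sizing worksheet. The following Python sketch is illustrative only; every value in it is a hypothetical placeholder, not a recommendation.

```python
# Hypothetical sizing worksheet. Every value below is a placeholder;
# substitute the answers from the questions above.

current_capacity_tib = 200        # capacity needed today (TiB)
annual_growth_rate = 0.30         # expected growth over the next year
headroom = 0.20                   # keep the file system below ~80% full

target_throughput_gbs = 40        # aggregate read throughput target (GB/sec)
target_iops = 500_000             # aggregate IOPS target

year_one_tib = current_capacity_tib * (1 + annual_growth_rate)
provisioned_tib = year_one_tib * (1 + headroom)

print(f"Provision ~{provisioned_tib:.0f} TiB for year one")
print(f"Targets: {target_throughput_gbs} GB/sec, {target_iops:,} IOPS")
```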

Storage Options

When deploying your cluster, you can choose to use either Google Persistent Disk or Local SSD.

Persistent Disk and Local SSD

Use Persistent Disk for most deployments; several Persistent Disk types are available, as described in the table below. Use Local SSD for performance-critical applications. Local SSD is ephemeral storage: it delivers high performance, but you must use IBM Storage Scale replication for reliability.

| Disk Type | Description |
| --- | --- |
| pd-standard | HDD; best for capacity |
| pd-balanced | Best for bandwidth |
| pd-ssd | Best for IOPS |
| pd-extreme | IBM Storage Scale optimizes the use of pd-balanced and pd-ssd, so pd-extreme is not commonly needed |
| hyperdisk-balanced | Google's newest generation of network block storage; a general-purpose balance of IOPS and throughput |
| hyperdisk-throughput | Google's newest generation of network block storage; optimized for throughput |
| hyperdisk-extreme | Google's newest generation of network block storage; optimized for the highest IOPS |
| local-ssd | Best for IOPS and bandwidth, though because it is not persistent it requires Storage Scale replication |

Choosing the Right Storage

Sycomp recommends that you use Persistent Disk unless you need high performance and density for ephemeral data. Local SSD can provide up to 9.3 GB/sec of read throughput per VM (as of February 2024) and 4.6 GB/sec of write throughput, which becomes 2.3 GB/sec of write throughput with Storage Scale replication enabled (because the data is written in two places at once). With Persistent Disk, you can achieve high IOPS with a throughput of up to 4.8 GB/sec per VM.

Persistent Disk delivers substantial IOPS and larger capacities than Local SSD, and because it is persistent it can be used on VMs that are shut down when not in use.
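The arithmetic behind these figures is worth making explicit: with Storage Scale replication enabled, every write lands on two servers at once, which halves the usable per-VM write throughput. A minimal sketch using the numbers quoted above:

```python
# Per-VM throughput figures quoted above (as of February 2024).
local_ssd_read_gbs = 9.3      # Local SSD read throughput per VM
local_ssd_write_gbs = 4.6     # Local SSD write throughput per VM
pd_throughput_gbs = 4.8       # Persistent Disk throughput per VM

# Storage Scale replication writes each block to two servers at once,
# so the usable write bandwidth is halved.
replication_copies = 2
effective_write_gbs = local_ssd_write_gbs / replication_copies

print(f"Local SSD effective write with replication: {effective_write_gbs} GB/sec")
# -> 2.3 GB/sec, versus up to 4.8 GB/sec per VM with Persistent Disk
```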

Networking

You can deploy your cluster with egress speeds of up to 200 Gbps per VM. When selecting a network configuration, match the NSD server networking to the storage performance (Persistent Disk performance or Local SSD performance) available to the machine. For example, with Local SSD storage you can use a 75 Gbit/sec network interface to match the storage performance, whereas with Persistent Disk SSD a 32 Gbit/sec egress option is sufficient.
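As a rule of thumb, 1 GB/sec of storage throughput needs about 8 Gbit/sec of network bandwidth. The sketch below illustrates matching an egress option to per-VM storage throughput; the list of tiers is illustrative, not a catalog of the options Google offers.

```python
# Match NSD server egress (Gbit/sec) to storage throughput (GB/sec).
# The tier list is illustrative only.

ILLUSTRATIVE_EGRESS_TIERS_GBIT = (10, 16, 32, 50, 75, 100)

def pick_egress_tier(storage_gb_per_sec: float) -> int:
    """Smallest illustrative tier that can carry the storage throughput."""
    needed_gbit = storage_gb_per_sec * 8   # 1 byte = 8 bits on the wire
    for tier in ILLUSTRATIVE_EGRESS_TIERS_GBIT:
        if tier >= needed_gbit:
            return tier
    return ILLUSTRATIVE_EGRESS_TIERS_GBIT[-1]

print(pick_egress_tier(9.0))  # Local SSD VM: ~72 Gbit needed -> 75 Gbit
print(pick_egress_tier(4.0))  # Persistent Disk VM: 32 Gbit is sufficient
```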

Depending on your needs, the network topology can include separate subnets.

For Storage Scale client nodes, choose a network interface speed that matches your application requirements. A Storage Scale client accesses data in parallel from all the NSD servers, so a single client can utilize more network bandwidth than any one storage server can provide.

The network topology used by Sycomp Storage includes a frontend network for client data traffic and a backend network for internal cluster traffic. The VPC is deployed within a single GCP project. Sycomp Storage can create the VPC, or you can choose to use an existing VPC.

NFS Architecture Options

There are two ways to deploy NFS servers with Sycomp Storage: you can place the NFS servers on the same virtual machines (VMs) as the NSD servers, or you can place them on separate VMs. When designing your architecture, consider:

  • The type of storage you are using

  • The size of your deployment

  • The performance you need from NFS and from the Storage Scale (NSD) client

If your goal is the best I/O throughput, the storage type you choose can determine the most cost-effective cluster architecture.

Single-Tier NFS

A single-tier architecture is beneficial for small clusters, or when the speed of the storage justifies deploying a high-performance machine type.

Example with NSD and NFS servers running on the same VMs with Google Local SSD storage.

With Local SSD storage, a single Google VM can read up to 9 GB/sec, which requires at least 75 Gbit Tier_1 egress networking to reach full performance. This class of networking is only available on machine types with a large amount of memory and a large number of vCPU cores, and the NFS server can put that extra memory and those cores to work, optimizing the deployment for cost performance. When an NFS server runs on the same VM as an NSD server, some of the VM's network bandwidth is consumed by NSD server traffic in addition to the NFS client traffic. This is fine as long as the resulting NFS client bandwidth meets your requirements, so take it into account when choosing machine and storage types.
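A back-of-the-envelope way to reason about this sharing is to assume NSD traffic takes a fixed share of the VM's egress. The 50% split in the sketch below is an assumption for illustration; the real split depends on cluster size and access pattern.

```python
# Rough model of NFS client bandwidth on a combined NSD + NFS VM.
# The 50% NSD share is an assumption, not a measured value.

vm_egress_gbit = 75      # Tier_1 egress of the machine type
nsd_traffic_share = 0.5  # assumed fraction of egress consumed by NSD traffic

nfs_gbit = vm_egress_gbit * (1 - nsd_traffic_share)
print(f"~{nfs_gbit:.1f} Gbit/sec (~{nfs_gbit / 8:.1f} GB/sec) left for NFS clients")
```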

Benefits of a single-tier NFS server architecture:

  • Simplifies small deployments

  • NSD servers don’t require a large amount of memory or many CPU cores, so the NFS service can utilize the extra CPU and memory available when high-performance networking is used.

  • A good fit when the required numbers of NFS and NSD servers are similar.

Two-Tier NFS

A two-tier NFS architecture is beneficial when you have large clusters of NSD servers and need fewer NFS servers.

Example using separate NFS server VMs with Google Persistent Disk.

For example, a Google VM using Persistent Disk can achieve approximately 1.2 GB/sec (as of Feb. 2023), so reading data at 300 GB/sec requires at least 250 NSD server VMs. These VMs do not need to be large or have high egress rates; an n2-standard-16 VM is sufficient to support that throughput. A single NFS server VM with 75 Gbit Tier_1 egress, by contrast, can provide more than 9 GB/sec. In this case, separating out the NFS server VMs helps because you need far fewer NFS servers to reach your desired throughput.
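The server counts in this example fall out of simple division. As a quick check, using the per-VM figures quoted above:

```python
import math

# Per-VM figures quoted above.
target_read_gbs = 300     # desired aggregate read throughput
pd_vm_read_gbs = 1.2      # Persistent Disk read per NSD server VM (Feb. 2023)
nfs_vm_read_gbs = 9.0     # NFS server VM with 75 Gbit Tier_1 egress

nsd_servers = math.ceil(target_read_gbs / pd_vm_read_gbs)   # 250
nfs_servers = math.ceil(target_read_gbs / nfs_vm_read_gbs)  # 34

print(f"NSD server VMs: {nsd_servers}, NFS server VMs: {nfs_servers}")
```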

Benefits of a two-tier NFS server architecture:

  • Reduce compute costs by using smaller machine types for the many NSD server VMs.

  • Use high memory nodes for NFS servers to improve caching.

  • Scale the number of NFS servers independently of the number of NSD servers to optimize cost.

Should I use NFS or the Storage Scale client?

You can use NFS or the Storage Scale NSD client to access the data in the Storage Scale file system. Which is best depends on your use case.

Table 1: When to use the Storage Scale client vs. NFS for data access.

| | Storage Scale Client | NFS |
| --- | --- | --- |
| Client throughput | Best | Good |
| Client IOPS | Best | Good |
| Access from non-cloud clients | Good | Best |
| Multi-client concurrent file access | Best | Good |

Performance

We ran NFS and NSD protocol read performance tests using the IOR benchmarking tool. We tested two cluster architectures: one using Google Persistent Disk and the other using Local SSD.

Test Cluster Configuration

| | Extreme Persistent Disk | Local SSD |
| --- | --- | --- |
| Number of NSD servers | 128 | 38 |
| Number of NFS servers | 64 | 47 |
| Number of test clients | 256 | 150 |
| NSD server machine type | n2-standard-64 | n2-standard-80 |
| NFS server machine type | n2-standard-80 | n2-standard-80 |
| Client machine type | n2-standard-16 | n2-standard-48 |

Results

| | Persistent Disk | Local SSD |
| --- | --- | --- |
| NFS read (MiB/sec) | 480,000 | 328,976 |
| NSD read (MiB/sec) | 519,589 | 333,150 |
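For comparison with the GB/sec figures used elsewhere in this guide, the results can be converted from MiB/sec (1 MiB is 1,048,576 bytes):

```python
# Convert the IOR results above from MiB/sec to GB/sec.
MIB_BYTES = 1024 ** 2

results_mib_s = {
    "Persistent Disk, NFS read": 480_000,
    "Persistent Disk, NSD read": 519_589,
    "Local SSD, NFS read": 328_976,
    "Local SSD, NSD read": 333_150,
}

for name, mib_s in results_mib_s.items():
    print(f"{name}: {mib_s * MIB_BYTES / 1e9:.0f} GB/sec")
```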

Validating Your Design Using Sycomp RISE

It is often not possible to deploy your entire application to test a new environment. Sycomp RISE was developed to let you validate the I/O performance of a new deployment for an existing application without deploying the application itself. RISE tests I/O performance against a file namespace that is identical in structure and file sizes (though not file data) to your current environment, so you can exercise the same directory layout your application uses today on the new deployment. This application-specific validation lets you test before you migrate a production application, which can greatly reduce risk.

If you have questions about RISE contact Sycomp at scaleongcp@sycomp.com.
