Design Guidelines

Use the following guidelines to design a POSIX file system that meets the requirements of your workload. The guidelines provide a framework to help you:

  • Assess the storage requirements of your workload

  • Evaluate the available storage options

  • Size your Sycomp Storage cluster

Workload Requirements

Identify the storage requirements of your high-performance workloads. Define your current requirements, making sure to account for future growth. Use the following questions as a starting point (a simple way to record the answers is sketched after the list):

  • How much storage capacity do you need today? In a year?

  • How many I/O operations per second (IOPS) or how much throughput in GB per second do you need? Do you need additional capacity to achieve the performance target?

  • Will data need to be moved between on-premises and the cloud?

  • Is there existing data in a Google Cloud Storage bucket that needs to be moved into the cluster?

  • Do you want to schedule data movement from one storage type to another? For example, migrate a file in a file system to a Google Cloud Storage bucket.

  • Do you need persistent storage, scratch storage, or both?

  • Do you have a backup software vendor that you use?

  • Do you have a disaster recovery (DR) strategy?

  • How many application nodes need access to the data?
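The sketch below is one minimal way to record these answers in Python so they can feed the sizing decisions later in this document. The structure, field names, and example values are hypothetical illustrations, not a Sycomp or IBM interface.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical structure for recording workload requirements; the fields map
# to the questions above and the values are illustrative only.
@dataclass
class WorkloadRequirements:
    capacity_tib_today: float          # usable capacity needed now
    capacity_tib_in_one_year: float    # expected capacity after growth
    read_throughput_gb_s: float        # aggregate read throughput target
    write_throughput_gb_s: float       # aggregate write throughput target
    iops_target: int                   # small-I/O operations per second
    move_data_on_prem_to_cloud: bool   # data movement between on-prem and cloud
    gcs_bucket_to_import: Optional[str] = None  # existing bucket to load, if any

requirements = WorkloadRequirements(
    capacity_tib_today=200,
    capacity_tib_in_one_year=350,
    read_throughput_gb_s=50,
    write_throughput_gb_s=20,
    iops_target=200_000,
    move_data_on_prem_to_cloud=True,
    gcs_bucket_to_import="example-input-data",
)
print(requirements)
```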

Storage Options

When deploying your cluster, you can choose to use either Google Persistent Disk or Local SSD.

Persistent Disk and Local SSD

You should use Persistent Disk for most deployments. There are several options when selecting the disk type, summarized in the table below. Use Local SSD for performance-critical applications. Local SSD is ephemeral storage, so you get high performance, but you need to use IBM Storage Scale replication for reliability.

| Disk Type | Description |
| --- | --- |
| pd-standard | (HDD) Best for capacity |
| pd-balanced | Best for bandwidth |
| pd-ssd | Best for IOPS |
| pd-extreme | IBM Storage Scale optimizes the use of pd-balanced and pd-ssd such that pd-extreme is not commonly needed. |
| hyperdisk-balanced | Google's newest generation of network block storage |
| hyperdisk-extreme | Google's newest generation of network block storage |
| hyperdisk-throughput | Google's newest generation of network block storage |
| local-ssd | Best for IOPS and bandwidth, though since it is not persistent it requires the use of Storage Scale replication. |
| Titanium SSDs | Google storage-optimized instances provide Local SSD performance with greater capacity and improved durability. |

Choosing the Right Storage

Sycomp recommends that you use Persistent Disk unless you need high performance and density for ephemeral data. Local SSD can provide up to 19 GB/sec of read throughput and 19 GB/sec of write throughput per VM; with Storage Scale replication enabled, the effective write throughput drops to about 8 GB/sec because the data is written to two places at once. With Persistent Disk you can achieve high IOPS with a throughput of 19 GB/sec per VM.

Persistent Disk offers a good number of IOPS and larger capacity than Local SSD, and because it is persistent, it can be used on VMs that are shut down when not in use.
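The write penalty from Storage Scale replication can be estimated with simple arithmetic. The sketch below is a rough illustration using the per-VM Local SSD figures quoted above: with a replication factor of 2 the naive estimate is half the raw write throughput, and measured values (around 8 GB/sec in the text) are somewhat lower because replication also consumes network bandwidth.

```python
# Rough estimate of client-visible write throughput per NSD server VM when
# IBM Storage Scale data replication is enabled. The figure below is the
# per-VM Local SSD number quoted above; treat it as a planning input.

def effective_write_gb_s(raw_write_gb_s: float, replicas: int) -> float:
    """Each client write is stored `replicas` times, so the usable write
    throughput is at most the raw disk write throughput divided by the
    replication factor (measured results are a little lower due to overhead)."""
    return raw_write_gb_s / replicas

local_ssd_write_gb_s = 19.0   # raw write throughput per VM, from the text above

print(f"No replication:    {effective_write_gb_s(local_ssd_write_gb_s, 1):.1f} GB/s")
print(f"2-way replication: {effective_write_gb_s(local_ssd_write_gb_s, 2):.1f} GB/s")
```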

Networking

You can deploy your cluster with egress speeds of up to 200 Gbps per VM. When selecting a network, it is best to match the NSD server networking with the storage performance available to the machine (see Google's Persistent Disk performance and Local SSD performance documentation). For example, for Local SSD storage you can use a 75 Gbit/s network interface to match the storage performance, whereas with Persistent Disk SSD a 32 Gbit/s egress option is sufficient.
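A quick way to sanity-check this matching is to convert storage throughput (GB/s) into network bandwidth (Gbit/s). The sketch below is a minimal illustration; the per-VM storage figures are the ones quoted elsewhere in this document and the egress tiers are the examples mentioned above.

```python
# Match NSD server egress bandwidth to per-VM storage throughput.
# 1 GB/s of storage traffic needs roughly 8 Gbit/s of network egress.

def required_egress_gbit_s(storage_gb_s: float) -> float:
    return storage_gb_s * 8

examples = {
    "Local SSD (about 9 GB/s read per VM)": 9.0,
    "Persistent Disk (about 1.2 GB/s per VM)": 1.2,
}

for name, gb_s in examples.items():
    print(f"{name}: needs ~{required_egress_gbit_s(gb_s):.0f} Gbit/s egress")

# Local SSD: ~72 Gbit/s  -> a 75 Gbit/s Tier_1 interface is a close match
# Persistent Disk: ~10 Gbit/s -> comfortably within a 32 Gbit/s egress option
```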

For Storage Scale client nodes, choose a network interface speed that matches your application requirements. This works because a Storage Scale client accesses data in parallel from all of the NSD servers, so a client can utilize more network bandwidth than a single storage server can provide.

The network topology that is used by Sycomp Storage includes a frontend network for client data traffic and a backend network for internal cluster traffic; these can be deployed as separate subnets depending on your needs. The VPC is deployed within a single GCP project. Sycomp Storage can create the VPC, or you can choose to use an existing VPC.

NFS Architecture Options

There are two ways to deploy NFS servers with Sycomp Storage: you can place the NFS servers on the same virtual machines (VMs) as the NSD servers, or you can place them on separate VMs. When designing your architecture, consider the following:

  • The type of storage you are using

  • The size of your deployment

  • The performance you need from NFS and the Storage Scale (NSD) client.

If the goal is to get the best I/O throughput, the storage type you choose can determine the most cost-effective cluster architecture.

Single-Tier NFS

A single-tier architecture (for example, NSD and NFS servers running on the same VMs with Google Local SSD storage) is beneficial when you deploy small clusters, or when the speed of the storage justifies deploying a high-performance machine type.

With Local SSD storage, a single Google VM can read at up to 9 GB/s, which requires at least 75 Gbit/s Tier_1 egress networking to reach full performance. This level of networking is available on machine types with a large amount of memory and a large number of vCPU cores, so the NFS server can utilize the additional memory and cores, optimizing the deployment for cost and performance. When running an NFS server on the same VM as an NSD server, some of the VM network bandwidth is consumed by NSD server traffic in addition to the NFS client traffic. This is fine as long as the resulting NFS client bandwidth meets your requirements, so consider this when choosing machine and storage types.
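One way to reason about the shared egress is sketched below. The split between NSD traffic and NFS client traffic depends entirely on your workload; the numbers here are illustrative assumptions, not measurements.

```python
# On a combined NSD + NFS server VM, NSD traffic to other cluster nodes and
# NFS client traffic share the same egress cap. Rough planning check with
# illustrative numbers (measure your own workload's split).

vm_egress_gbit_s = 75.0     # Tier_1 egress tier mentioned in the text
nsd_traffic_gbit_s = 20.0   # assumed egress spent serving other cluster nodes

nfs_headroom_gbit_s = vm_egress_gbit_s - nsd_traffic_gbit_s
print(f"Egress left for NFS clients: {nfs_headroom_gbit_s:.0f} Gbit/s "
      f"(~{nfs_headroom_gbit_s / 8:.1f} GB/s)")
```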

Benefits of a single-tier NFS server architecture:

  • Simplifies small deployments

  • NSD servers don't require a large amount of memory or many CPU cores, so the NFS service can utilize the extra CPU and memory available when high-performance networking is used.

  • Works well if the required numbers of NFS and NSD servers are similar.

Two-Tier NFS

A two-tier NFS architecture (for example, separate NFS server VMs in front of NSD servers using Google Persistent Disk) is beneficial when you have a large cluster of NSD servers and need fewer NFS servers.

For example, a Google VM using Persistent Disk can achieve approximately 1.2 GB/sec (as of Feb. 2023); therefore, to read data at 300 GB/sec, at least 250 NSD server VMs are required. These VMs do not need to be large or have high egress rates, so an n2-standard-16 VM is sufficient to support that throughput. A single NFS server VM with 75 Gbit/s Tier_1 egress, on the other hand, can provide more than 9 GB per second. In this case, separating out the NFS server VMs helps because you need far fewer NFS servers than NSD servers to reach your desired throughput.
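The example above can be written out as a small sizing calculation. The per-VM figures below are the ones quoted in the text; treat them as planning assumptions rather than guarantees.

```python
import math

# How many NSD and NFS server VMs are needed to reach a target aggregate
# read throughput, using the per-VM figures quoted in the example above.

target_read_gb_s = 300.0
pd_nsd_server_gb_s = 1.2   # Persistent Disk NSD server VM (text, as of Feb. 2023)
nfs_server_gb_s = 9.0      # NFS server VM with 75 Gbit/s Tier_1 egress (text)

nsd_servers = math.ceil(target_read_gb_s / pd_nsd_server_gb_s)
nfs_servers = math.ceil(target_read_gb_s / nfs_server_gb_s)

print(f"NSD server VMs needed: {nsd_servers}")   # 250
print(f"NFS server VMs needed: {nfs_servers}")   # 34, far fewer than NSD servers
```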

Benefits of a two-tier NFS server architecture:

  • Reduces compute costs when many NSD server VMs are required.

  • Lets you use high-memory nodes for NFS servers to improve caching.

  • Lets you scale the number of NFS servers independently of the number of NSD servers to optimize cost.

Should I use NFS or Storage Scale client?

You can use NFS or the Storage Scale NSD client to access the data in the Storage Scale file system. Which one is best depends on your use case.

Table 1: Compare when to use the Storage Scale client vs. NFS for data access.

|                                     | Storage Scale Client | NFS  |
| ----------------------------------- | -------------------- | ---- |
| Client Throughput                   | Best                 | Good |
| Client IOPS                         | Best                 | Good |
| Access from non-cloud client        | Good                 | Best |
| Multi-client concurrent file access | Best                 | Good |

Performance

We ran NFS and NSD protocol read performance tests using the IOR benchmarking tool. We tested two cluster architectures: one using Google Persistent Disk and the other using Local SSD.

Test Cluster Configuration

|                          | Persistent Disk | Local SSD      |
| ------------------------ | --------------- | -------------- |
| NSD Server Machine Type  | n2-standard-64  | n2-standard-80 |
| NFS Server Machine Type  | n2-standard-80  | n2-standard-80 |
| Client Machine Type      | n2-standard-16  | n2-standard-48 |
| Number of NSD Servers    | 128             | 38             |
| Number of NFS Servers    | 64              | 47             |
| Number of Test Clients   | 256             | 150            |

Results

|                     | Persistent Disk | Local SSD |
| ------------------- | --------------- | --------- |
| NFS Read (MiB/sec)  | 480,000         | 328,976   |
| NSD Read (MiB/sec)  | 519,589         | 333,150   |

Validating Your Design Using Sycomp RISE

It is often not possible to deploy your entire application to test a new environment. Sycomp RISE was developed to help you validate the I/O performance of a new deployment for an existing application without having to deploy the entire application. Sycomp RISE tests I/O performance against a file namespace that is identical in structure and file sizes (but not file data) to your existing environment, so you can exercise a new deployment using the same directory structure as your current application environment. This application-specific validation gives you a way to test before you migrate a production application, which can greatly reduce risk.

If you have questions about RISE contact Sycomp at scaleongcp@sycomp.com.
