Frequently Asked Questions (FAQs)¶
Do I really need a supercomputer?¶
You probably do. Most organizations don't realize how much faster their time to insights and answers would be if they had access to a supercomputer. If your execution time is longer than you would like, or you're running out of memory (OOM), disk space, CPU, GPU, or patience, then you probably need a supercomputer.
You might not need one for every one-off task, but data pipelines, ML model training and inference, and other computationally heavy workloads can benefit greatly from the power of a supercomputer. And with Eugo, you don't have to worry about managing the infrastructure, so you can focus on building.
How is Eugo so computationally fast?¶
A mix of distributed parallel compute, optimized software (more on this later), and cutting-edge hardware (GPU, SIMD, and AI accelerators). Basically, we squeeze every ounce of performance out of the hardware.
What's the difference between Eugo and Google Colab, Amazon SageMaker, AWS Glue, Azure ML, etc.?¶
In most cases, a bicycle and a rocket will both get you from point A to point B, but one is going to get you there a lot faster. And in some cases, a bicycle won't get you there at all (like to the moon). Same thing with Eugo: it's faster, more powerful, and more flexible, and it can run workloads (like nuclear fusion simulations) that wouldn't be possible with other tools. Additionally, there's no need to manage dependencies, worry about scaling (like picking the right-sized instance or setting up a cluster), or fight out-of-memory (OOM) errors. Eugo just works, so all you have to do is write code.
Does it actually work?¶
Yep, it does. We've spent years perfecting the foundation, and we even use Eugo to build Eugo. As someone close to the founders put it: “the feature is the scale.”
Will it work for my data and my use case?¶
Almost certainly. We provide hundreds of highly optimized libraries in C, C++, Rust, and Python out of the box, and more are coming every day. Don't see what you need? You can bring your own version or ask us to incorporate one (we charge a fee).
Who needs this?¶
Anyone tired of waiting for code to run and of building things that don't "make your beer taste better." If you've got a lot of data or run computationally intensive workloads, Eugo can make them faster.
Are you just a thin wrapper around Ray, Spark, OpenMP, or CUDA?¶
We use these libraries for distributed computing, but they're just one piece of the puzzle. Plus, you can use any of these libraries together on the same cluster at the same time.
Do you do professional services (proserv)?¶
Yes, we do. We can help you with everything from setting up your first cluster to optimizing your code. Simple tasks like setting up a cluster are free, but we charge for more complex tasks, like writing bespoke code, or migrating your existing workloads to Eugo.
Can you use Eugo on other cloud providers besides AWS?¶
It's possible, but it will require custom work. We're working on making it easier to use Eugo on other cloud providers.
Are you more energy efficient than traditional on-prem supercomputers?¶
Yes. Our clusters are serverless—they only exist when you need them (and you're only billed for what you use). And we're built entirely on Arm (aarch64), giving us a major sustainability boost.
Why does my notebook take almost 2 minutes to boot up?¶
The actual work of creating and starting a cluster is very fast (usually less than 5 seconds). The time you're seeing is the time it takes for the cluster itself to become healthy so that you can start running code. This time can vary depending on the size of the cluster and other factors outside of our control (like the speed of the underlying hardware).
How can I download my notebooks from Eugo?¶
You can download your notebooks in two ways:
- In your `workspaces` tab on the Eugo platform, click the `Download Notebooks` button next to `Open EugoIDE`.
- In your `EugoIDE`, click the `File` menu in the top-left corner, then `Download as`, and select the format you want to download.
How do we deal with bandwidth concerns vs traditional supercomputers?¶
Because on-premises superclusters are colocated, they have very low latency and high bandwidth. In most cases, Eugo clusters are distributed across data centers and, in High Availability (HA) scenarios, across multiple availability zones, so there is some latency and bandwidth overhead. Here's how Eugo mitigates that:
- We offer the ability to run a Ray cluster inside a placement group so all nodes will be placed in the same physical rack.
- We enable NVIDIA GPUDirect and `NCCL`. Similar to EFA (Elastic Fabric Adapter), `NCCL` allows GPUs on a single machine and across different machines to communicate directly, bypassing any CPU code.
- We enable jumbo frames by default when possible. Jumbo frames carry more than the standard 1500 bytes per packet, so the same data needs fewer packets, which reduces network protocol overhead and increases throughput.
- Each node gets an ENA (Elastic Network Adapter) that's not virtualized or semi-virtualized. This allows for higher throughput and lower latency.
- We are actively working on using Elastic Fabric Adapter (EFA) and libfabric (and on some instances, Intel Fabric adapters, Huawei, etc.) for MPI communication on all of our nodes. EFA is a network interface for Amazon EC2 instances that enables you to run applications requiring high levels of inter-node communication at scale on AWS. With it, some network calls from our apps can go straight to the network adapter and on to other instances, bypassing the CPU and the Linux kernel and removing excessive round trips.
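To make the jumbo frames point concrete, here is a rough back-of-the-envelope sketch in Python. It is not Eugo code: the function names are hypothetical, the 40-byte header figure assumes plain IPv4 + TCP with no options, and Ethernet framing is ignored. The 9001-byte value is the jumbo MTU that AWS documents for EC2; your actual path MTU may differ.

```python
import math

# Assumption: 20-byte IPv4 header + 20-byte TCP header, no options.
IP_TCP_HEADER_BYTES = 40


def packets_needed(payload_bytes: int, mtu: int) -> int:
    """Number of TCP/IPv4 packets needed to carry payload_bytes at a given MTU."""
    payload_per_packet = mtu - IP_TCP_HEADER_BYTES
    return math.ceil(payload_bytes / payload_per_packet)


def header_overhead(payload_bytes: int, mtu: int) -> float:
    """Fraction of all bytes sent that are headers rather than payload."""
    header_bytes = packets_needed(payload_bytes, mtu) * IP_TCP_HEADER_BYTES
    return header_bytes / (payload_bytes + header_bytes)


if __name__ == "__main__":
    payload = 1_000_000_000  # illustrative 1 GB transfer
    for mtu in (1500, 9001):  # standard MTU vs. jumbo MTU
        count = packets_needed(payload, mtu)
        overhead = header_overhead(payload, mtu)
        print(f"MTU {mtu}: {count} packets, {overhead:.2%} header overhead")
```

Under these assumptions, the jumbo MTU needs roughly one sixth as many packets for the same transfer, which is where the reduced per-packet processing comes from.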
What is EPM (Eugo Package Manager) for HPC?¶
Installing software packages for HPC is different from installing ordinary external packages. In virtually all cases, you need to build the software from source to get the best performance. EPM is a custom package manager for Eugo optimized for HPC applications. It lets you install and manage software packages, built from source, on your Eugo cluster. It's like `apt-get`, `dnf`, or `pip` for Eugo. You can install packages that aren't currently included in the Eugo runtime, for example, C/C++ packages like `libtool` and `onnxruntime`, Rust packages like `hyper`, or Python packages like `thriftpy2`.
To use it, follow the steps below. For this example, we will use `aws-cdk-lib` as the package we want to install. It is a pure Python library with a JavaScript bundle:
- Create a directory called `dependencies` in your workspace.
- Create a directory with the name of the library in your `dependencies` directory (e.g. `aws_cdk_lib`):
  - Notes:
    - The name of the directory should be the name of the library you want to install.
    - The name of the directory should be the same as the name of the library in the `meta.json` file (see more on this below).
    - The name of the directory should not have `-` (dashes) in it. If there is a `-` in the name of the library, replace it with `_` (underscores).
      - For example, `aws-cdk-lib` would be `aws_cdk_lib`.
    - The name of the directory may include `.` (dots) in it. If there is a `.` in the name of the library, just include it.
      - For example, `jaraco.classes` would just be `jaraco.classes`.
- Create a file called `meta.json` in your newly created directory (e.g. `aws_cdk_lib`) and list the packages it depends on. Eugo will automatically install these packages for you when you start your cluster.
{
  "kind": "python",
  "name": "aws_cdk_lib",
  "version": {
    "kind": "pypi",
    "should_auto_update": true,
    "value": "2.185.0"
  },
  "dependencies": {
    "runtime": {
      "standalone": [
        "python/aws_cdk.asset_awscli_v1",
        "python/aws_cdk.asset_kubectl_v20",
        "python/aws_cdk.asset_node_proxy_agent_v6",
        "python/aws_cdk.aws_cdk.cloud_assembly_schema",
        "python/cattrs",
        "python/constructs",
        "python/jsii",
        "python/publication",
        "python/typeguard"
      ]
    }
  },
  "#comments": {
    "requirements": "Requirements.txt",
    "description": "This file is used to manage dependencies for aws_cdk_lib."
  }
}
At this point, your example tree should look like this:

    dependencies/
        aws_cdk_lib/
            meta.json

- For each dependency in `aws-cdk-lib`, perform Steps 2 & 3 above recursively. That is, create a directory in your `dependencies` directory with the name of the dependency (e.g. `aws_cdk.asset_awscli_v1`), and create a `meta.json` file with its dependencies listed.
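The steps above can be sketched as a small Python script that applies the directory naming rules, writes a `meta.json`, and reports which listed dependencies still need their own directory. This is only an illustration, not part of EPM: the function names are hypothetical, the generated file contains only the fields shown in the example above, and the mapping from an entry like `python/cattrs` to the directory `dependencies/cattrs` is an assumption based on that example.

```python
import json
import tempfile
from pathlib import Path


def directory_name(library: str) -> str:
    """Apply the naming rules: dashes become underscores, dots are kept."""
    return library.replace("-", "_")


def scaffold(workspace: Path, library: str, version: str, runtime_deps: list[str]) -> Path:
    """Create dependencies/<name>/meta.json for one Python library from PyPI."""
    name = directory_name(library)
    package_dir = workspace / "dependencies" / name
    package_dir.mkdir(parents=True, exist_ok=True)
    meta = {
        "kind": "python",
        "name": name,
        "version": {"kind": "pypi", "should_auto_update": True, "value": version},
        "dependencies": {"runtime": {"standalone": runtime_deps}},
    }
    meta_path = package_dir / "meta.json"
    meta_path.write_text(json.dumps(meta, indent=2) + "\n")
    return meta_path


def missing_dependencies(workspace: Path) -> list[str]:
    """List dependency entries that do not yet have their own meta.json.

    Assumption: "python/cattrs" maps to the directory dependencies/cattrs.
    """
    root = workspace / "dependencies"
    missing = set()
    for meta_path in root.glob("*/meta.json"):
        meta = json.loads(meta_path.read_text())
        entries = meta.get("dependencies", {}).get("runtime", {}).get("standalone", [])
        for entry in entries:
            dep_dir = directory_name(entry.split("/", 1)[-1])
            if not (root / dep_dir / "meta.json").exists():
                missing.add(entry)
    return sorted(missing)


if __name__ == "__main__":
    # Use a throwaway directory so the sketch does not touch a real workspace.
    with tempfile.TemporaryDirectory() as tmp:
        workspace = Path(tmp)
        scaffold(workspace, "aws-cdk-lib", "2.185.0", ["python/constructs", "python/jsii"])
        print(missing_dependencies(workspace))
```

Running `missing_dependencies` after each round of scaffolding gives a simple way to tell when the recursion in the last step is complete: the list is empty once every listed dependency has its own `meta.json`.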
Additional examples of advanced use cases will be added in the future (compilable C/C++ libraries, Rust libraries, Python libraries, etc.).