What is a Snapshot?
As a team begins transitioning from traditional cloud hosting environments, there will inevitably be a time when someone wants to “snapshot” a system. A snapshot preserves the state and data of a virtual machine at a specific point in time. Snapshots are used to restore that virtual machine to that saved state.
- The state includes the virtual machine’s power state (for example, powered-on, powered-off, suspended).
- The data includes all of the files that make up the virtual machine. This includes disks, memory, and other devices, such as virtual network interface cards.
A Few Snapshot Limitations (according to VMware)
Snapshots have a number of inherent disadvantages that make them difficult to manage:
- Large numbers of snapshots are difficult to manage and track
- Consume large amounts of disk space
- Not protected in the case of hardware failure
- Can negatively affect performance
VMware itself states: “Do not run production virtual machines from snapshots on a permanent basis.”
Nonetheless, there are specific times when a snapshot can be used to test specific configurations and quickly revert to a saved state if unexpected results occur. Snapshots can also make it easier to duplicate a VM from another system, or reverse-engineer an unfamiliar VM into familiar assets and environments. Snapshots are only ever a short-term solution because of their administrative headaches, inflexibility, high capacity usage, and vulnerability to outages and disruption.
Enabling Snapshots
Only a Team Manager can enable Snapshots for their team. If you’re not certain what your role is within a team, or don’t seem to have sufficient permissions, contact your Team Manager. If you’re not sure who your Team Manager is, contact support.
To enable/disable Snapshot functionality for a team:
- Navigate to your Team landing page using the site logo on the top left of the page
- On the main navigation bar on the left, click Settings at the bottom of the bar
- Under the General tab find Virtual Host Snapshot and check/uncheck the Enabled box
- Click Save Changes
Creating a Snapshot
Users are limited to a single snapshot for each host. Snapshotting the same host again will overwrite the previous snapshot. To create a snapshot:
- On the main navigation bar on the left click Hosts under the “Compute” header
- Click the Run in question
- On the actions bar, click the Snapshot dropdown and choose New
Note: A snapshot will capture all running processes, logs, commands, etc. It’s considered a best practice to shut down your VM before attempting to take a snapshot.
Depending on the complexity and size of your environment, it may take 5 to 10 minutes to fully capture the snapshot. During that time you may see a yellow banner message stating “The host is currently in an unknown state. Some functionality may be limited.” This banner stays up until the snapshot has completed.
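The one-snapshot-per-host rule described above can be pictured with a small Python sketch. This is only an illustrative model, not Arcus code: the names Snapshot, SnapshotStore, take, and get are hypothetical and do not correspond to any Arcus API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical record of a saved host state."""
    host: str
    power_state: str  # for example "powered-on", "powered-off", "suspended"
    taken_at: datetime


@dataclass
class SnapshotStore:
    """Hypothetical in-memory model that keeps at most one snapshot per host."""
    by_host: Dict[str, Snapshot] = field(default_factory=dict)

    def take(self, host: str, power_state: str) -> Optional[Snapshot]:
        """Save a snapshot for the host and return the one it overwrote, if any."""
        previous = self.by_host.get(host)
        # Keying by host name means a second snapshot replaces the first.
        self.by_host[host] = Snapshot(host, power_state, datetime.now(timezone.utc))
        return previous

    def get(self, host: str) -> Optional[Snapshot]:
        """Return the current snapshot for the host, or None if none was taken."""
        return self.by_host.get(host)


store = SnapshotStore()
store.take("web-01", "powered-off")            # first snapshot, nothing overwritten
replaced = store.take("web-01", "powered-on")  # overwrites the first snapshot
print(replaced.power_state)  # powered-off
print(len(store.by_host))    # 1
```

Because the store is keyed by host, there is never a second snapshot to track for the same host, which matches the overwrite behavior described in this section.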
Restoring from a Snapshot
Once a snapshot has been captured, you can restore your host to the snapshot at any time.
To restore from a snapshot:
- On the main navigation bar on the left click Hosts under the “Compute” header
- Click the Run in question
- On the Snapshot dropdown, click the Restore button
The Restore button will not be visible unless a snapshot has already been taken.
Note: If there is a cloud-side failure (disk, compute, network) during a restore, a system may end up in an unrecoverable state.
Replacing a Snapshot
You can replace a snapshot with a new one at any time.
To replace a snapshot:
- On the main navigation bar on the left click Hosts under the “Compute” header
- Click the Run in question
- On the Snapshot dropdown, click the Replace button
The Replace button will not be visible unless a snapshot has already been taken.
Deleting a Snapshot
To delete a snapshot:
- On the main navigation bar on the left click Hosts under the “Compute” header
- Click the Run in question
- On the Snapshot dropdown, click the Delete button
Recommendations for Usage
Bottom line: Arcus offers a better approach to the traditional notion of snapshots. It is a case of “how” vs. “what”:
- WHAT - Save the state of a system at a particular point in time
- HOW - Use VMware/AWS/Azure provided “snapshot” capability
- (BETTER) HOW - Use Arcus assets
Arcus provides a better HOW option to the WHAT. It makes use of assets and modular designs to save the state of a system at a particular point in time. This design approach has many benefits, including:
- Usability. Easier and quicker to update components or try out different configurations
- Performance. Snapshot performance degrades over time.
- Scalability. No need to carry around and update entire “catalogs” of monolithic VMs
- Transparency. Know exactly how a system is built
- Portability. Easily deploy systems across different cloud providers
- Stability. Systems driven from automation
- Only option for Physical Systems
- Better Options for Long Term Backups
- Better Options for High Availability / Disaster Recovery
Furthermore, using Arcus allows a user to automate the deployment of:
- multiple copies of a system so they are disposable (i.e. launch three, use one, throw it away, use the next one)
- scenarios consisting of many systems to create complex integration environments or ranges
Note: VMware has issued guidance that attempting to snapshot Windows Server 2019 can cause errors, and may only capture file structure, and not application data. Further data on this is located here.
Snapshots and Team Limits
We have always taken a cautious approach to snapshots. We resisted enabling them for a long time for several reasons, but users really wanted them. Since we added support for snapshots a couple of years ago, we have absorbed the associated storage costs as part of the core package. However, snapshot usage has exploded. At some points during a given month, snapshots consume 30% or more of our available storage pool. This usage, coupled with increased costs from the storage hardware and software vendors, has made it necessary to adjust how we track storage usage. While there is no price change at this time, effective 15 DEC 2023, snapshot usage will be counted against a Team’s storage limits. This will be reflected in the total storage consumed, and in the logic that allows new deployments to proceed. For now we are using the same standard across all three cloud providers (AWS, Azure, VMware), where the snapshot size is equal to the original disk(s) size. We are working to see if we can get more precise data out of VMware, but so far that info is limited.
Note that if either of these changes results in your Team being over its storage limits, no action is required - no overage charge to be paid nor a requirement to cut usage; that is on us. However, you would need to remove some storage or add more capacity if you are over the limit before you can deploy new systems, modify existing ones, or take new snapshots. Additional storage capacity can be acquired if necessary.
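As a rough worked example of the accounting described above, the Python sketch below counts each snapshot at the full size of the host’s original disk(s) and checks the total against a Team limit. The names Host, host_storage_gb, team_storage_gb, and within_limit are hypothetical, the sizes are invented for illustration, and including the size of the requested deployment in the check is an assumption rather than documented Arcus behavior.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Host:
    """Hypothetical host record; sizes are in GB."""
    name: str
    disk_sizes_gb: List[int]
    has_snapshot: bool = False


def host_storage_gb(host: Host) -> int:
    """Disk usage for one host, counting a snapshot as equal to the original disk(s)."""
    disks = sum(host.disk_sizes_gb)
    return disks * 2 if host.has_snapshot else disks


def team_storage_gb(hosts: List[Host]) -> int:
    """Total storage consumed by a Team, snapshots included."""
    return sum(host_storage_gb(h) for h in hosts)


def within_limit(hosts: List[Host], limit_gb: int, additional_gb: int = 0) -> bool:
    """Assumed gate: current usage plus the requested amount must fit in the limit."""
    return team_storage_gb(hosts) + additional_gb <= limit_gb


hosts = [
    Host("web-01", [100, 50], has_snapshot=True),  # 150 GB disks + 150 GB snapshot
    Host("db-01", [200]),                          # 200 GB disks, no snapshot
]
print(team_storage_gb(hosts))                        # 500
print(within_limit(hosts, 600, additional_gb=100))   # True
print(within_limit(hosts, 600, additional_gb=150))   # False
```

In this example, deleting the snapshot on web-01 would bring the Team’s usage down to 350 GB, which is one way to get back under a limit without adding capacity.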