Summary of Slingshot's replication features in EVO
EVO's Slingshot automation engine has been expanded to include replication functionality. A new type of job can be created specifically for backup and other redundancy goals where complex directory structures need to be handled.
EVO's replication jobs are flexible: The source files can be located on EVO, but it is not required. In fact, a job can be created to handle files from a source outside of EVO to a destination outside of EVO. Jobs can also be created to handle file replication completely internally to EVO.
Sources and destinations can be shares hosted by EVO, USB drives hosted by EVO, remote SMB storage or an AWS S3 bucket.
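Replication jobs are intended to be one-way (master-slave): the source is treated as authoritative, and changes are pushed from the source to the destination.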
Differences between sync and copy
Each replication job can be set to either "Sync" mode or "Copy" mode. The mode determines how the job reacts to files deleted at the source. The job will do one of two things: either it will ignore the deletion (keeping the file on the destination), or it will notice that the file has been deleted and remove it from the destination so that the destination matches the source.
Use "Sync" when the destination should represent the current state of the source. Any changes you make to the source directory or share will be carried over. New files will be replicated, files that were moved will only appear in the new path and files that are deleted from the source will be removed from the destination as well.
Use "Copy" (default) when the destination should retain everything in the state it was last sent. New files will be added just like in "Sync" mode, but when a file is deleted from the source it will be kept on the destination. Files that are moved or renamed will be copied to the destination again using the new name and path, and the original file name and path will also be retained on the destination. This can result in multiple duplicate copies on the destination if files are frequently moved at the source.
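The difference between the two modes can be illustrated with a small simulation. This is only a sketch of the behavior described above, not Slingshot's implementation: the `replicate` function and the dictionaries standing in for shares are hypothetical, and real jobs compare files on disk rather than in-memory values.

```python
def replicate(source: dict[str, str], destination: dict[str, str], mode: str = "copy") -> dict[str, str]:
    """Return the destination contents after one replication pass.

    Both arguments map a relative path to file content (a stand-in for real files).
    This is a hypothetical model of the documented behavior, not EVO code.
    """
    if mode not in ("copy", "sync"):
        raise ValueError(f"unknown mode: {mode}")
    result = dict(destination)
    # Both modes send new and changed files to the destination.
    result.update(source)
    if mode == "sync":
        # Sync also removes anything that no longer exists at the source.
        for path in list(result):
            if path not in source:
                del result[path]
    return result


# First run: both modes produce the same destination.
source = {"project/a.mov": "A", "project/b.mov": "B"}
dest_copy = replicate(source, {}, "copy")
dest_sync = replicate(source, {}, "sync")

# Then a.mov is moved and b.mov is deleted at the source.
source = {"archive/a.mov": "A"}
dest_copy = replicate(source, dest_copy, "copy")
dest_sync = replicate(source, dest_sync, "sync")

print(sorted(dest_copy))  # ['archive/a.mov', 'project/a.mov', 'project/b.mov']
print(sorted(dest_sync))  # ['archive/a.mov']
```

In the "Copy" result, the moved file exists under both its old and new path and the deleted file is still present, which is the duplication effect described above.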
Choosing the path for your destination and why it might matter
As you can imagine, Sync mode and Copy mode each have their own benefits and possible drawbacks.
As with any backup plan, it's important to review the data at the destination before and after running a job to make sure it contains what is expected. Whenever possible, consider creating a subdirectory to use as the destination for replication jobs rather than using the root of a share. This is especially important if you intend to use "Sync" mode. "Sync" mode will make the destination match the source, which means that if the selected destination already contains data that is not present on the source, that data could be removed. Again... "Sync" mode will delete all existing data from the destination that is not on the source! Unless you're using an empty volume as the destination, this can be very destructive. Creating a new directory as the destination will ensure that this doesn't create a problem. Replication jobs will not interact with anything "above" the destination path.
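Before pointing a "Sync" job at a destination that already holds data, it can help to list what is there but not on the source. The following is a minimal sketch, assuming both locations are reachable as local or mounted paths; the function name `files_sync_would_remove` and the directory names are hypothetical, and the script only reports files, it never deletes anything.

```python
import tempfile
from pathlib import Path


def files_sync_would_remove(source_root: Path, destination_root: Path) -> list[Path]:
    """List files under destination_root that have no counterpart under source_root."""
    source_files = {
        p.relative_to(source_root) for p in source_root.rglob("*") if p.is_file()
    }
    extras = [
        p.relative_to(destination_root)
        for p in destination_root.rglob("*")
        if p.is_file() and p.relative_to(destination_root) not in source_files
    ]
    return sorted(extras)


# Demonstration on throwaway directories (stand-ins for a source and a share root).
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp, "source")
    dst = Path(tmp, "share_root")
    src.mkdir()
    dst.mkdir()
    (src / "edit.prproj").write_text("project")
    (dst / "edit.prproj").write_text("project")
    (dst / "unrelated_archive.zip").write_text("older data")

    for extra in files_sync_would_remove(src, dst):
        print(f"not on source: {extra}")
```

If the list is not empty, a dedicated subdirectory for the job is the safer destination.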
Use a common time server for sources and destinations
Replication jobs evaluate what data should be synchronized or copied based on a number of factors, including the modification time and the file size. It's important to ensure that the reference clock on EVO matches the workstations and any other server being used as a source or destination. The easiest way to keep everything synchronized is to have all systems configured to use the same NTP (time) server. Also be sure to check that the time zone matches on each system.
Set the destination to be Read Only for everything but the sending system
If a file is modified at the source it will need to be updated on the destination. That means that an existing file on the destination will need to be overwritten. Before doing that, the replication job will check to make sure that the file being overwritten isn't actually newer than the file at the source. If the destination file is newer than the source, this creates a conflict in which there's no way to know which is the "correct" file to preserve. If this situation occurs, the replication job will skip overwriting the destination to avoid data loss. A user must manually resolve this conflict, but it's not always obvious that a conflict has occurred unless the transfer logs are being monitored. One way to avoid this situation is to only grant write permission on the destination to the EVO handling replication jobs. In this configuration, other users can safely read files on the destination, but modification of the files at the destination is then prevented outside of administrator maintenance.
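The overwrite check described above can be sketched as a small decision function. This is an illustration under assumptions, not the actual replication logic: the `decide` function, the `Action` names, and the one-second `tolerance` are hypothetical, and the real job may weigh additional factors beyond modification time and size.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    COPY = "copy"
    SKIP_UNCHANGED = "skip (unchanged)"
    SKIP_CONFLICT = "skip (destination is newer)"


def decide(
    source_mtime: float,
    source_size: int,
    dest_mtime: Optional[float] = None,
    dest_size: Optional[int] = None,
    tolerance: float = 1.0,  # assumed allowance for timestamp rounding, in seconds
) -> Action:
    """Decide what to do with one file, given modification times and sizes."""
    if dest_mtime is None:
        # The file does not exist on the destination yet.
        return Action.COPY
    if dest_mtime > source_mtime + tolerance:
        # Destination is newer: overwriting could lose data, so leave it alone.
        return Action.SKIP_CONFLICT
    if abs(dest_mtime - source_mtime) <= tolerance and dest_size == source_size:
        return Action.SKIP_UNCHANGED
    return Action.COPY


print(decide(2000.0, 10, 1000.0, 10))  # Action.COPY
print(decide(1000.0, 10, 2000.0, 12))  # Action.SKIP_CONFLICT
```

The sketch also shows why clock agreement matters: if the destination's clock runs ahead, ordinary files can look "newer" and be skipped as conflicts.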
Consider how to handle your Recycle Bin when using Sync mode
If you're synchronizing a source share from an EVO that has its recycle bin enabled, consider whether the contents of that recycle bin should be replicated to the destination. For example, if your destination is metered, you may want to set your job to exclude the recycle bin's contents.
In general we suggest not including the recycle bin in the replication, but there are some situations where it may be desired.
Here are some recommendations based on typical cases:
| Scenario | Recommended configuration | Reasoning |
|---|---|---|
| Syncing EVO to S3 | Enable recycle bin on EVO (source), do not include the recycle bin in the replication job. | S3 is metered storage, so including the recycle bin in the replication job will result in higher data usage. It may also be difficult to interact with the recycle bin via S3. |
| Syncing EVO to another EVO | Enable recycle bin on the source EVO, do not include the recycle bin in the replication job, do not enable recycle bin on the destination EVO. | If both EVO systems are being used in production, system resources including network and disk should be available to editors as much as possible. Replicating the recycle bin would consume more of those resources for files that are likely not needed (because they were deleted by someone). |
| Syncing EVO to EVO Nearline | Disable recycle bin on the source EVO, do not include the recycle bin in the replication job, enable recycle bin on the destination EVO. | This configuration allows users to free up space on their tier 1 EVO storage more quickly. Deleted files will be handled by the recycle bin on the Nearline. Restoring files in this configuration will require using the Nearline's trash browser and manually copying the file back to the original share. |
| Syncing EVO to a local non-EVO storage | Enable recycle bin on the EVO (source), include the recycle bin in the replication job. | This option highlights where recycle bin redundancy can be useful. This configuration extends some of EVO's recycle bin functionality outside of EVO and results in an additional measure of data protection. |
Verify data integrity
Features like replication are convenient automatic processes that reduce the need for regular human intervention in normal system operations. However, from time to time it's good to check in on the process to make sure everything is working as expected. The details will be unique to every situation, but periodic verification should include at least the following regular tasks:
- Check the Replication job summary file for any errors or warnings that might indicate trouble
- Check that the directory structure matches expectations
- Check that modification times for recently edited files are correct
It's also recommended that file integrity be confirmed from time to time via a full hash of all files. Depending on the features available, this may require all files be "read" by the system performing the verification.
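One way to run such a full-hash check is sketched below. It assumes the source and destination are both mounted as ordinary directories on the machine doing the verification; the function names `sha256_of` and `verify_tree` are hypothetical, and SHA-256 is simply one reasonable choice of hash. Every file is read in full on both sides, so expect this to take time on large shares.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large media files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_tree(source_root: Path, destination_root: Path) -> dict[str, list[str]]:
    """Compare every source file against the file at the same relative path on the destination."""
    report: dict[str, list[str]] = {"missing": [], "mismatched": []}
    for src in sorted(p for p in source_root.rglob("*") if p.is_file()):
        rel = src.relative_to(source_root)
        dst = destination_root / rel
        if not dst.is_file():
            report["missing"].append(rel.as_posix())
        elif sha256_of(src) != sha256_of(dst):
            report["mismatched"].append(rel.as_posix())
    return report
```

Calling `verify_tree(Path("/path/to/source"), Path("/path/to/destination"))` with your own mount points returns the relative paths that are missing or differ. Files that exist only on the destination are not reported, which suits "Copy" mode where extra files are expected.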
If there are any changes to network connectivity or server credentials, make sure to re-verify that replication jobs can complete successfully. If a job fails, the summary or detailed logs can assist in determining what went wrong.
Be aware of file attributes
Replication jobs can preserve ShareBrowser metadata (tags and comments) by retaining that information in an internal database.
However, it is not currently possible to preserve extended metadata (i.e. extended attributes/xattr), such as Finder-level colors and Finder-level tags associated with files and folders, in the replication process. This is a common consideration when moving data from one file system to another. If preservation of extended metadata is important, it will be necessary to add files to a compressed archive that preserves this information while the files are still on the source. To test whether the attributes you care about survive, make a replication job that sends a sample file to the destination and a second replication job that pulls it back to the original source (in a different directory) or to another location where the file can be analyzed. You can then review the resulting file to verify whether significant attributes have been retained.
As always, please contact SNS support with any questions or if there's any trouble tracking down an issue with running jobs.