Silicon Valley startups might not think it's a big deal, but being able to run entirely on a private network (either "behind a firewall" or on an entirely disconnected network) is pretty huge. Without AeroFS, your choices today kind of suck, especially for 10-500 person companies (or bigger companies where the corporate option sucks or isn't available). Dropbox doesn't work if you care about security. You're left with various forms of SMB crap, more backend-type things like iSCSI, blasts from the past (NFS, AFS), or science projects (ZFS).
I don't know where you get the idea that there are no good choices for on-premises file share and sync solutions. Products like FileCloud (http://www.getfilecloud.com) have been solving this pain very nicely.
On iOS, even AeroFS kind of sucks, due to some horrible decisions Apple made. Basically every app has to adopt every storage provider's API, and right now only Dropbox and maybe Box have any uptake. (And iCloud, of course, but iCloud sucks a lot and isn't self-hostable or even enterprise-hostable.)
I've been struggling to find a good Dropbox alternative for on-premises use. AeroFS is missing the option to share a folder/file through a link to non-AeroFS users, which is vital to our workflow.
This FileCloud is interesting, thanks. Any other products you're aware of? I thought I had found them all, but you never know :)
Filosync (my product) has share via link to non-users (users who don't have the client app installed). http://www.filosync.com
(I'm the guy who wrote Arq, the Mac backup app, too).
If you are willing to trust your privacy to strong crypto (something you probably would want to use even on a private network), ObjectiveFS (https://objectivefs.com) might be worth looking into.
You mean (for an AWS S3-backed system): trust your privacy to an application binary (or, potentially, to source that might get audited periodically, and to the toolchain and binary actually corresponding to that source...), plus to any OS or application bugs at other layers...
Being able to work on a totally disconnected network is different. Maybe if you could support Eucalyptus's S3 implementation or another backing store like that, it would be equivalent (and that might be interesting).
I'm sure plenty of people are fine with an Internet-connected, AWS S3 backed system, though, especially with additional crypto.
Yeah. We do a lot of security work and have clients inside the Federal government. We run Github inside our firewall. Something like this would be pretty rad, I just emailed it to one of our infrastructure guys.
I'm also trying to get it looked at by some federal customers; if you run into any problems, let me know (I've gone through the DOD ATO process for a Linux-based appliance; civ agencies aren't usually as weird but...).
So I'm actually setting up a private cloud for my company right now. It's a company of 1, so the requirements are quite extensive. :-P
I'm building a FreeNAS server with 6x3TB hard drives in a RAID-Z2 config. My goal is to allow my Mac to use it for Time Machine backups, but to also use AeroFS as my file sync mechanism when I'm both in the office and on the road.
Hopefully it works out smoothly. I'll have to figure out how to access the machine from behind my router, and I'll have to determine how to get it to automatically back up to S3 + Glacier. I think there's going to be a lot of details that I'll have to research here.
>My goal is to allow my Mac to use it for Time Machine backups
Note that this is completely unreliable. Time Machine isn't built to work over the network. The network support is badly hacked together and will destroy your backups on network faults. They basically treat the network volume as a block device that gets corrupted when the connection fails. Using it over wifi makes this painfully obvious.
I'm certain about this for Time Machine over AFP. iSCSI was indeed pointed out as a potential avenue with fewer issues, but if I recall it didn't really fix things completely. If you're running wired+iSCSI you're probably much less prone to this. Try yanking some network cables during a backup and see how it handles it.
I ran into more issues than most because the particular AP I have at home doesn't work well with the macbook air I was trying to backup and resulted in a lot of disconnects. So it was a bit of a stress test but the issue is real and well documented. You can find instructions all over the web on how to fsck sparsebundles to try and get back your backups[1].
This is particularly annoying because Time Machine is supposed to be a user-friendly version of the versioned backups you can do with rsync+hardlinks. I have a script that does that for me on my personal Linux machine, and I would love to be able to deploy Time Machine+AFP for the rest of the family. Unfortunately, Apple decided it really needed hard-linked directories for just this case, and decided that the way to do that over the network was to have a single binary blob containing an HFS image on the server instead of the actual files in the filesystem (filesystems don't usually support hard-linked directories; HFS was hacked to do it). That way you not only lose direct access to the individual files on the server, you also lose the whole backup when there's disk corruption...
I should probably have a look again to see if iSCSI is a viable solution to this.
I can confirm that. I recently lost a TM backup made over the (wired) network. The cable failed, the backup in progress hung, and the TM volume remained locked until I rebooted, at which point the sparsebundle got screwed up. TM started it all over again.
> (sigh, it seems the tech community has become blind to anything that isn't heavily blogvertised.)
There's so much released and so much noise that it's hard to keep track of all the good things that come out - especially if they're not mentioned within your tribe of friends/followed people.
I don't know how to get round that - the web is huge and no one can keep track (or have a network that touches) all of the open source releases.
Unison is no longer under active development, right? Would love to use it but going forward, is there a big enough community to fix bugs, add new features etc?
It has not been a problem in my experience. There isn't major development (proper support for inotify and flexible topology would be nice), but Unison does what it says extremely solidly. There's certainly an active user community and the occasional bugfix/feature. Reread that notice concentrating on the good parts, and take a look at the changelog. There's certainly activity; it's just not time-intensive - this is both the blessing and the apparent curse of using a high-level language.
Had high hopes for BTSync, and we used it for several months on the recommendation of a colleague. However, we found it was fairly buggy in handling files with non-alphanumeric characters in their names, and those files just wouldn't sync. We suspect it might have been a rogue path parser for .SyncIgnore not following Unix file conventions (we were using it on OS X and Linux), but we recently moved over to AeroFS and haven't had a problem since.
I do like BTSync and considered it, but I'm not the biggest fan of the idea that my data is stored on everyone's computer. AeroFS allows me some semblance of control, and as I grow, it will allow me to add users easily.
I won't know until I try it though, I could just as easily change my mind and use BTSync.
It doesn't put your data on everyone's computer. The data goes onto the computers that are sharing the folder. If you have two computers, the data is on two computers.
BTSync could be installed everywhere, but it only syncs between the folders with the same key. So if only your computers have your key(s) then only they would have contents of your folder(s).
Check out the recovery probabilities of raidz2 vs raid10. You're gaining ~17% more space at the cost of a substantial performance hit and almost no difference in reliability, depending on how you estimate HDD failure rates. Raidz2 is not fast even when everything is working well, and if any of your drives develops a speed problem, the whole array is reduced to that speed.
I have an 11x4T raidz2, and will be migrating to 3x4x4T raid10. My last array was 2x4x1.5T raidz1, with 3 drive failures in its 5-year lifetime, but not concurrent, so no data lost.
I use CrashPlan to get data from one place to another; I back up NAS to CrashPlan, PCs to CrashPlan, and PCs to NAS. If you don't use CrashPlan's app to back up to their cloud, it doesn't stop you from using it to back up to other machines you own. And it works on Windows, Mac, Linux, Solaris, and with a bit of hacking, FreeBSD.
Could you elaborate a bit on your CrashPlan setup? I have a somewhat similar setup and have been trying to back up my NAS (Synology DS1812+) to CrashPlan for about half a year now. I was able to start the backup, but the transfer rate quickly dropped to about 1.2 Mbps or lower, and having backed up about 1.5 TB the CrashPlan client now won't even run a backup.
Are you using Crashplan+ or any of the Pro/Enterprise plans? What sort of transfer rates do you get? And were you able to complete a full backup of that 11x4T array?
I'm using CrashPlan+ Family Unlimited (it's a home NAS!); I also bought a separate CrashPlan+ license when I first signed up. I don't have more than 1TB backed up specifically to CrashPlan as yet though. A big chunk of that NAS is for future growth, and a lot of it is a playground for experimenting with serving up media that we either have physical copies of or is of low replacement value.
I do see CrashPlan updating at a leisurely rate, but that's OK with me, as my upload bandwidth is limited. I'd be interested to see if it completely chokes past a certain point, like you say it does for you. But it's not yet at the point where I need to worry about it.
While you might be right about the small space advantage, I would feel more safe if the position of the failing drive (and the next failing drive) does not matter at all.
If a drive develops a speed problem, I'll replace it.
You might feel more safe, but you're not as safe as you feel.
Suppose that once one drive has failed, the chance of another drive failing (a URE at least, but potentially worse) while being read entirely (as necessary during a scrub/resilver) is 10%. I don't think this is very implausible, as RAID HDD failures are often correlated, and not only because it's inconvenient to distribute HDD purchases.
The chance of successfully recovering in a mirrored scenario is 90%. You only need to read one drive.
In raidz1 with 6 drives, your probability is 0.9^5 - you need to get lucky 5 times, because you need to read every other drive fully to recover. That brings you down to less than 60%. Raidz1 with 5 drives is 0.9^4 - a bit over 65%.
Raidz2 for 6 drives combines these two stats. The probability of one more disk failing while reading is 1-0.9^5. But you can recover from that, and the probability of that recovery succeeding is 0.9^4. So the probability of two consecutive failures is (1-0.9^5)(1-0.9^4). Subtract that from 1 and you get your probability of getting lucky.
Work out the numbers, and you'll find it's only 86%. It's less than mirroring.
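The arithmetic in the comments above can be checked in a few lines (note the 10% per-drive chance of failing during a full read is the assumption made above, not a measured figure):

```python
# Probability of surviving a rebuild after one drive has already failed,
# assuming each remaining drive independently has a 10% chance of
# failing (or hitting a URE) while being read in full.
p = 0.9  # per-drive chance of surviving a full read

mirror = p                                   # read 1 surviving drive
raidz1_6 = p ** 5                            # must read the other 5 drives
raidz2_6 = 1 - (1 - p ** 5) * (1 - p ** 4)   # can absorb one more failure

print(f"mirror:   {mirror:.0%}")    # 90%
print(f"raidz1x6: {raidz1_6:.0%}")  # 59%
print(f"raidz2x6: {raidz2_6:.0%}")  # 86%
```

With these assumptions, the raidz2 rebuild survives 86% of the time versus 90% for a mirror, which is the result claimed above.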
There are a couple of simplifications in the maths above (don't forget, a second drive failure means another drive to resilver, and a restart to the procedure), and a big assumption based on reliability %. But frankly, unless you think HDD failure is already fairly rare - and it's not - mirroring is usually ahead on safety, and the lead only increases with number of disks in the array.
I used to think like you do, that I'd feel more safe if a single disk failed, knowing I could still tolerate another failure. That's why my array is currently raidz2. But I worked out the numbers, and it changed my mind. And that's why my next array is going to be raid10.
Not to mention that it's hard to get much more than 150MB/sec for any realistic I/O with raidz2. I have over 1GB/sec of potential HDD bandwidth; unless I'm doing a scrub, I can't touch that potential performance. And of course all disks act like they're on the same spindle, so there is zero random I/O advantage for raidz.
Has anyone compared the reliability of this to Bittorrent's solution? I tried Aero a while ago and found it to be very buggy then. Are there any major differences between the two?
http://www.bittorrent.com/sync
The product has come a long way in the past year. In particular, our Private Cloud offering is deployed at a number of large organizations, all of which are quite happy.
Regarding comparison with BT Sync -- I think their product is great, but we try to go beyond simply syncing files between devices by allowing for a lot more administrative control to the IT organization while still giving the users a really simple Dropbox-like syncing experience. Things like remote wipe, version management, conflict resolution, and so on.
It depends on your definition of "helped". YC helped us in a lot of intangible ways (the company probably wouldn't exist without their help, for example :).
But YC did not directly help us land these customers in the forms of introductions. For the most part, they've come to us through word of mouth and internal employee references.
We are a small company and are very happy with BTSync.
However, if you want tight admin control, you may have some issues. For example: If an employee leaves, we're not sure how to revoke privileges for them once they have access to the folder.
So, I recommend very small companies use BTSync since it's easy and free. But if you're a larger company, you have probably outgrown BTSync and should get something like AeroFS.
I had the same experience. I've been using BitTorrent Sync for a while now and absolutely love it. It's super fast and I've never had any inconsistencies in syncing. The closed-source nature of it is not ideal, but if that's a concern, you're no better off with AeroFS or Dropbox.
An "enterprise Dropbox" conversation is one that many customers will have with AeroFS. I know this from experience trying to sell this very type of product.
The challenge is that sales cycles are long and potentially high-touch. One way to mitigate that is by getting sales distribution through third parties. But avoid integrating with a bunch of third-party storage platforms unless you get commitments for leads from the vendors. In other words, view integration efforts as an engineering-to-sales arbitrage.
I was just going to say, in my opinion this gives you less control over the whole environment and you have to trust AeroFS to fix any bugs and add features when/if they are needed.
I guess on the plus-side, you get phone support, if that is what you need.
Owncloud is open-source and really quite powerful, I am surprised it was not mentioned more in these comments.
Sounds like a great idea, especially for all the companies that are thinking about building their own cloud. I suspect building is their highest cost, and if by plugging AeroFS in that could be eliminated, that sounds like just the right way to go!
I am curious: what other sales mechanisms and/or software packages did you try before you settled on the private-cloud-without-touching-your-servers idea?
This may be counter-futuristic, but what if you sold them servers along with your software? What if you gave your clients best-in-class storage coupled with the best way to manage it? I suspect you've thought about it before, and I want to know what the reaction was like.
Enterprise clients are a black box for me, so anything else you share would be interesting to know.
Overall I think their approach for easy installation and configuration is a good one. I struggle with these same issues at my job--we sell products with a complex application stack to customers that often have no system administrators. The only issue I see here is with the upgrade path. Particularly for a product that is meant for file storage, I can't imagine downloading a 1TB backup file and uploading it again every time there is a new release.
The Appliance actually does not store any file data on it, so your appliance upgrades likely won't include 1TB backup files :)
The (optional) AeroFS Team Server is what would store file data in the company if you wanted to, but many of our customers actually just end up using the direct peer-to-peer syncing without a team server.
I've been using AeroFS for a while on Linux (Mint & Debian).
I have to say it's pretty nice. I don't actually know what makes it better than Dropbox and all the other choices; I never looked into any advanced features. A buddy just sent me an invite and I started using it, and now it's part of my workflow since it works reliably between my work & home machines.
It'd be interesting to know your process for creating and maintaining the appliances you distribute to clients and any tools/packages you chose to help do it.
You could try Filosync (my product). You can run the server part on any system with a JRE -- it's not a VM image. Just wget the jar file and java -jar jarfile. More instructions here: http://www.filosync.com/help/create-server-standalone
You're joking right? They already provide a tar.gz that you can probably hack around into installing it with puppet given enough motivation. But I'd rather have a proper package that I can upgrade automatically, etc.
So AeroFS was in beta for something like 4 years so they could do serverless, peer-to-peer Dropbox, and now they are launching a private cloud server product. I can't help but worry that these guys don't have a clear product they're committed to.