Open Source Software List: The Ultimate List

http://www.datamation.com/open-source/

Accessibility

1. The Accessibility Project

Launched in 2013, this site aims to provide information on making other websites accessible to people with a variety of impairments, particularly those who are blind. You can read the content at the link above; if you'd like to contribute, visit the project's GitHub page. Operating System: OS Independent

Accounting

2. Edoceo Imperium

This web-based accounting package was created with small and medium-sized businesses (SMBs) in mind. It includes CRM, work order and invoice capabilities as well as standard accounting features. Check out the online demo to see it in action. Operating System: OS Independent

3. FrontAccounting

Another web-based accounting option for SMBs, FrontAccounting boasts inventory tracking and manufacturing management abilities. It's been downloaded more than 200,000 times. Operating System: OS Independent

4. GnuCash

GnuCash combines personal finance software with small business accounting software, which some small business owners find helpful. It can track investments, create graphs, import financial data, set up scheduled transactions and perform standard double-entry accounting. Operating System: Windows, Linux, OS X

5. LedgerSMB

LedgerSMB combines ERP and accounting capabilities in a single package, and it also includes a flexible development framework for extending its features. It has been downloaded more than 86,000 times since 2006. Operating System: Windows, Linux, OS X

6. TurboCASH

Used by more than 80,000 businesses, TurboCASH is a flexible accounting package that compares favorably with QuickBooks and Sage. It was created in the UK but also has a chart of accounts and currency features designed for U.S. businesses. Operating System: Windows

App Collection

7. OpenDisc

The OpenDisc project collects many of the most popular open source applications for Windows into one download. You can also get the project on a CD for a donation of $10. Operating System: Windows

Anti-Spam/Email Filtering

8. ASSP

ASSP claims to be "the absolute best SPAM fighting weapon that the world has ever known!" It offers easy, browser-based setup and works with most mail servers. Operating System: OS Independent.

9. MailScanner

Downloaded more than 1.3 million times, MailScanner is based on SpamAssassin and works with anti-virus software like ClamAV to protect mail servers at companies or ISPs. Support is available through third-party companies. Operating System: OS Independent.

10. Scrollout F1

This full-featured mail security solution incorporates anti-spam, anti-virus and other capabilities with an interface that the project creators say is as easy to use as a car radio. Paid support is available. Operating System: Windows, Linux.

11. SpamAssassin

This Apache project claims to be the "#1 Enterprise Open-Source Spam Filter." It uses a wide variety of methods to identify and block spam, and it works with nearly all mail servers. Operating System: primarily Linux and OS X, although Windows versions are available.

12. SpamBayes

SpamBayes uses statistical algorithms to calculate the probability that an incoming message is spam, and it adapts over time as spammers change their methods. It's available as a plug-in for many popular email services and clients, including Outlook, Thunderbird and others. Operating System: OS Independent.
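As a rough illustration of the statistical idea described for SpamBayes, the sketch below scores a message with a tiny naive Bayes classifier. It is not SpamBayes's own code or API: the training messages, the function names `train` and `spam_probability`, and the add-one smoothing are all assumptions made for the example.

```python
import math
from collections import Counter

def train(messages):
    """Count word occurrences per label from (text, label) pairs."""
    counts = {"spam": Counter(), "ham": Counter()}
    totals = Counter()
    for text, label in messages:
        counts[label].update(text.lower().split())
        totals[label] += 1
    return counts, totals

def spam_probability(text, counts, totals):
    """Return P(spam | text) under a naive Bayes model with add-one smoothing."""
    vocab = set(counts["spam"]) | set(counts["ham"])
    log_score = {}
    for label in ("spam", "ham"):
        n_words = sum(counts[label].values())
        # Log prior: share of training messages carrying this label.
        score = math.log(totals[label] / sum(totals.values()))
        for word in text.lower().split():
            # Smoothed likelihood of the word given the label.
            score += math.log((counts[label][word] + 1) / (n_words + len(vocab)))
        log_score[label] = score
    # Normalize the two log scores into a probability.
    top = max(log_score.values())
    weights = {k: math.exp(v - top) for k, v in log_score.items()}
    return weights["spam"] / (weights["spam"] + weights["ham"])

# Hypothetical training data, invented for this sketch.
TRAINING = [
    ("win cash prize now", "spam"),
    ("cheap prize offer now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]
counts, totals = train(TRAINING)
```

Calling `train` again with newly labeled messages is how a filter of this kind would adapt as spam changes over time.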

Anti-Virus/Anti-Malware

13. ClamAV

One of the most popular open source security applications, ClamAV has been incorporated into many different products and has been called "the de facto standard for mail gateway scanning." The core program works on UNIX-based systems, but the website also offers information on Immunet, a ClamAV-based Windows solution that is available in both free and paid versions. Operating System: Linux, but front-ends and additional versions are available for other OSes.

14. ClamTk

This variation on ClamAV adds an easy-to-use GUI to the popular anti-virus engine. Now ten years old, this is a mature project that is included in many Linux distributions. Operating System: Linux.

15. ClamWin Free Antivirus

This Windows-based version of ClamAV boasts more than 600,000 users. It offers a scanning scheduler, integration with Windows Explorer and Outlook, automatic downloads of the updated malware database and support for Windows 7 and 8. Operating System: Windows.

Artificial Intelligence

16. Caffe

The brainchild of a UC Berkeley PhD candidate, Caffe is a deep learning framework based on expressive architecture and extensible code. Its claim to fame is its speed, which makes it popular with both researchers and enterprise users. According to its website, it can process more than 60 million images in a single day using just one NVIDIA K40 GPU. It is managed by the Berkeley Vision and Learning Center (BVLC), and companies like NVIDIA and Amazon have made grants to support its development.

17. CNTK

Short for Computational Network Toolkit, CNTK is one of Microsoft's open source artificial intelligence tools. It boasts outstanding performance whether it is running on a system with only CPUs, a single GPU, multiple GPUs or multiple machines with multiple GPUs. Microsoft has primarily utilized it for research into speech recognition, but it is also useful for applications like machine translation, image recognition, image captioning, text processing, language understanding and language modeling.

18. Deeplearning4j

Deeplearning4j is an open source deep learning library for the Java Virtual Machine (JVM). It runs in distributed environments and integrates with both Hadoop and Apache Spark. It makes it possible to configure deep neural networks, and it's compatible with Java, Scala and other JVM languages.

The project is managed by a commercial company called Skymind, which offers paid support, training and an enterprise distribution of Deeplearning4j.

19. Distributed Machine Learning Toolkit

Like CNTK, the Distributed Machine Learning Toolkit (DMTK) is one of Microsoft's open source artificial intelligence tools. Designed for use in big data applications, it aims to make it faster to train AI systems. It consists of three key components: the DMTK framework, the LightLDA topic model algorithm, and the Distributed (Multisense) Word Embedding algorithm. As proof of DMTK's speed, Microsoft says that on an eight-machine cluster, it can "train a topic model with 1 million topics and a 10-million-word vocabulary (for a total of 10 trillion parameters), on a document collection with over 100-billion tokens," a feat that it says is unparalleled by other tools.

20. H2O

Focused more on enterprise uses for AI than on research, H2O has large companies like Capital One, Cisco, Nielsen Catalina, PayPal and Transamerica among its users. It claims to make it possible for anyone to use the power of machine learning and predictive analytics to solve business problems. It can be used for predictive modeling, risk and fraud analysis, insurance analytics, advertising technology, healthcare and customer intelligence.

It comes in two open source versions: standard H2O and Sparkling Water, which is integrated with Apache Spark. Paid enterprise support is also available.

21. NuPIC

Managed by a company called Numenta, NuPIC is an open source artificial intelligence project based on a theory called Hierarchical Temporal Memory, or HTM. Essentially, HTM is an attempt to create a computer system modeled after the human neocortex. The goal is to create machines that "approach or exceed human level performance for many cognitive tasks."

In addition to the open source license, Numenta offers NuPIC under a commercial license, and it also licenses the patents that underlie the technology.

22. OpenCyc

Developed by a company called Cycorp, OpenCyc provides access to the Cyc knowledge base and commonsense reasoning engine. It includes more than 239,000 terms, about 2,093,000 triples, and about 69,000 owl:sameAs links to external semantic data namespaces. It is useful for rich domain modeling, semantic data integration, text understanding, domain-specific expert systems and game AIs. The company also offers two other versions of Cyc: one for researchers that is free but not open source and one for enterprise use that requires a fee.

23. OpenNN

Designed for researchers and developers with advanced understanding of artificial intelligence, OpenNN is a C++ programming library for implementing neural networks. Its key features include deep architectures and fast performance. Extensive documentation is available on the website, including an introductory tutorial that explains the basics of neural networks. Paid support for OpenNN is available through Artelnics, a Spain-based firm that specializes in predictive analytics.

24. SystemML

First developed by IBM, SystemML is now an Apache big data project. It offers a highly-scalable platform that can implement high-level math and algorithms written in R or a Python-like syntax. Enterprises are already using it to track customer service on auto repairs, to direct airport traffic and to link social media data with banking customers. It can run on top of Spark or Hadoop.

25. TensorFlow

TensorFlow is one of Google's open source artificial intelligence tools. It offers a library for numerical computation using data flow graphs. It can run on a wide variety of systems with one or more CPUs and GPUs, and it even runs on mobile devices. It boasts deep flexibility, true portability, automatic differentiation capabilities and support for Python and C++. The website includes a very extensive list of tutorials and how-tos for developers or researchers interested in using or extending its capabilities.

26. Torch

Torch describes itself as "a scientific computing framework with wide support for machine learning algorithms that puts GPUs first." The emphasis here is on flexibility and speed. In addition, it's fairly easy to use, with packages for machine learning, computer vision, signal processing, parallel processing, image, video, audio and networking. It relies on LuaJIT, a just-in-time compiler for the Lua scripting language.

Astronomy

27. Celestia

Travel virtually to anywhere in the known universe at any time with Celestia. It displays hundreds of thousands of celestial bodies as they would appear in the night skies. Operating System: Windows, Linux, OS X.

28. KStars

Similar to Stellarium, KStars lets users view "up to 100 million stars, 13,000 deep-sky objects, all 8 planets, the sun and moon, and thousands of comets and asteroids." It also includes a number of tools helpful for amateur astronomers, such as an observation list, an FOV editor, a sky calendar, supernova alerts and a glossary of technical terms. (Note that in order to use KStars on Windows, you'll have to download KDE for Windows.) Operating System: Windows, Linux

29. Stellarium

Another option for budding astronomers, this one confines the point of view to planet earth rather than allowing users to zoom throughout the universe, but it is so accurate that it is used by many planetariums. Operating System: Windows, Linux, OS X.

Audio Tools

30. Amarok

Amarok invites users to rediscover their music. It integrates with a variety of Web services and includes features like dynamic playlists, collection management, bookmarking, file tracking and import from other music databases, including iTunes. Operating System: Windows, Linux, OS X, iOS.

31. Ardour

Designed for use by professional audio engineers, musicians, soundtrack editors and composers, Ardour is a complete audio recording, mixing and editing suite. Key features include support for most hardware, flexible recording, unlimited multichannel tracks, unlimited undo/redo and much more. Operating System: Linux, OS X

32. aTunes

This Java-based music player and manager displays complete information—including lyrics—for the song currently playing. It's a good option for users with particularly large music collections. Operating System: OS Independent

33. Audacious

Unlike some audio players, Audacious doesn't use a lot of system resources, so it doesn't degrade system performance when you're using your PC for other tasks as well as listening to music. The latest update offers improved playlist shuffling, easier recording of Internet streams and a better equalizer interface. Operating System: Windows, Linux.

34. Audacity

A perennial favorite among Linux desktop users, Audacity gets hundreds of thousands of downloads per month. It was updated in July with new scrubbing and seeking features, preset effects and improved plug-in installation. Operating System: Windows, Linux, OS X

35. CDex

Downloaded more than 60 million times, CDex is a simple, handy tool for converting CDs to data files. It supports multiple file formats, including WAV, MP3, FLAC, AAC, WMA and OGG. Operating System: Windows.

36. Cdrtools

This suite of command-line tools includes the cdrecord CD/DVD/Blu-ray recording software, as well as tools for reading optical media, extracting audio, and more. It's a mature project that has been around for quite a few years. Operating System: Windows, Linux, OS X.

37. cdrtfe

Cdrtfe serves as a front-end for cdrtools and some other command-line recording applications. It can burn audio CDs, data discs, bootable discs, DVD-Video discs, ISO images and other types of optical media. The latest version supports Windows 10. Operating System: Windows.

38. Clementine

Based on an older version of Amarok, Clementine focuses on providing "a fast and easy-to-use interface for searching and playing your music." It supports Internet radio streams, cloud computing services like Dropbox and Google Drive, CUE sheets, tabbed playlists, audio CD playback and much more. Operating System: Windows, Linux, OS X, Android.

39. DeaDBeeF

This self-proclaimed "ultimate music player" supports a very long list of file formats. Key features include cue sheet support, tabbed playlists, cover art display, 18-band graphic equalizer, tag editor, gapless playback and more. Operating System: Linux, Unix, Android.

40. EasyTAG

EasyTAG allows users to view and edit the tag fields on MP3, MP2, MP4/AAC, FLAC, Ogg Vorbis, MusePack, Monkey's Audio, and WavPack files. It includes a tree-based browser and CDDB support for manual and automatic searches. Operating System: Windows, Linux

41. Exaile

Another option for Linux users, Exaile offers both playback and a powerful music manager. Key features include smart playlists, advanced track tagging, multiple plug-ins, automatic album art, lyrics and much more. Operating System: Linux.

42. FlacSquisher

This tool was made for audiophiles who like to keep their original music in the lossless FLAC file format. FlacSquisher converts those files to MP3s so that users can take them with them on mobile devices without taking up too much space. Operating System: Windows.

43. Fre:ac

Fre:AC stands for "free audio converter," and it can rip audio CDs or convert among numerous file formats. It's also portable, meaning that you can run it from a USB thumb drive without installing it on your system. Operating System: Windows.

44. Frinika

Java-based Frinika is a lightweight but fairly complete music workstation. It includes a sequencer, soft-synths, real-time effects and recording capabilities. Operating System: OS Independent

45. Giada

Giada describes itself as "a free, minimal, hardcore audio tool for DJs, live performers and electronic musicians." It's not quite as full-featured as the other options on the list, but it is an effective, lightweight looping tool. Operating System: Linux.

46. Guayadeque

Created for "all music enthusiasts," Guayadeque is a full-featured music management system that can handle large file collections. Noteworthy features include a configurable crossfader engine, configurable silence detector for gapless playback, labeling, smart play mode, last.fm support and more. Operating System: Linux

47. Hydrogen

"Professional yet simple and intuitive," Hydrogen is a cross-platform drum machine. The video on the site helps you quickly see how it works and what it can do. Operating System: Windows, Linux, OS X.

48. Jajuk

Java-based Jajuk works on multiple platforms. Aimed at advanced users, it offers a very full feature set as well as an intuitive interface. Operating System: OS Independent.

49. Jams

Formerly a paid app, Jams is now an open source Android music player with an elegant interface. It can connect to Google Play Music for purchasing songs and includes features like tag support, blacklisting, 9-band equalizer, scrobbling, crossfade, album art download and more. Operating System: Android.

50. KMid

This KDE app plays both Midi and karaoke files, making it easy for you to serenade your sweetheart. It includes a piano player interface and also accepts input from external keyboards. Operating System: Windows

51. Linux MultiMedia Studio (LMMS)

"Made by musicians, for musicians," LMMS is a full-featured music production system with plenty of presets and samples built in. Note that despite the word "Linux" in the name, it is available for Windows and OS X as well. Operating System: Windows, Linux, OS X

52. Mixxx

Made for professional DJs, Mixxx offers "everything you need to start making DJ mixes in a tight, integrated package." It supports more than 30 DJ MIDI controllers, integrates with iTunes and includes BPM detection and sync. Operating System: Windows, Linux, OS X.

53. MOC

Simply select a directory, and the MOC (Music On Console) audio player will play all files in that directory. Supported file formats include MP3, Ogg Vorbis, FLAC, Musepack, Speex, WAVE, AIFF, and AU. Operating System: Linux/Unix, OS X

54. Mp3splt

Mp3splt is an audio utility that does just one thing—it lets you cut mp3 and ogg files into smaller files and rename them. It’s especially useful if you need to split an entire album into individual tracks. Operating System: Windows, Linux, OS X

55. MuseScore

If you are a musician, teacher or composer interested in generating your own sheet music, MuseScore makes it very easy and offers most of the same features you'll find in the proprietary software. The website includes some tutorials and plenty of other help to get you started, and the interface is very intuitive. Operating System: Windows, Linux, OS X

56. Nightingale

Nightingale promises users "a beautiful interface with a wide range of supported audio formats, all with multi-platform support." It has a large library of add-ons that extend its capabilities. Operating System: Windows, Linux, OS X.

57. orDrumbox

Another open source option for creating your own drum loops and feeds, orDrumbox offers an easy-to-use interface. Features include auto-composition, poly-rhythms, an arpeggiator, automatic sounds/track matching, custom soft-synths and low-fi rendering. Operating System: Windows, Linux, OS X.

58. Qmmp

Qmmp, which stands for "Qt-based MultiMedia Player," offers features like support for skins, 10-band equalizer, streaming playback, cover art, cue sheet support and multiple playlists. Its interface is very simple and similar to older apps like Winamp and XMMS. Operating System: Windows, Linux

59. Radio Downloader

If your favorite online radio station only offers streaming content, you can turn it into a podcast you can listen to any time with Radio Downloader. It comes with built-in support for BBC content and a helpful "favourites" tab. Operating System: Windows

60. Rhythmbox

Rhythmbox is a Linux-only audio player for the GNOME desktop. The interface and feature set are fairly basic. Operating System: Linux.

61. SoX

This cross-platform command line tool calls itself the "Swiss Army knife of sound processing programs." It can convert files from one type to another, record and play audio files, and add effects. Operating System: Windows, Linux, OS X.

62. TEncoder

This app provides an interface to three other popular open source video tools: FFMPEG, MEncoder and Mplayer. It can convert video files, rip unprotected DVDs, add subtitles, download from YouTube, extract audio or video and more. Operating System: Windows.

63. XiX Music Player

This cross-platform player supports album art and lyrics, reverse play, crossfading, trimming, shuffle, repeat, song rating, search, and more. It's also small enough to run on a Raspberry Pi board. Operating System: Windows, Linux, OS X

64. xwax

This Linux-only tool was designed for beat mixing and scratch mixing. Features include needle drops, pitch changes, scratching, spinbacks and rewinds. Operating System: Linux.

65. Yoshimi

Yoshimi is a Linux-only software synthesizer forked from an older version of ZynAddSubFX. The project name comes from a song by The Flaming Lips. Operating System: Linux.

66. ZynAddSubFX

This software synthesizer comes in Windows and Linux versions. Features include real-time, polyphonic, multitimbral and microtonal capabilities and a long list of effects and filters. Operating System: Windows, Linux.

Backup

67. AMANDA

The Advanced Maryland Automatic Network Disk Archiver, or AMANDA, is a popular network backup solution that can save data from Linux, Unix or Windows systems to hard drives, tape or optical media. Zmanda, which sponsors the project, offers commercial products based on the same technology. Operating System: Windows, Linux, OS X.

68. Areca Backup

For standalone systems, Areca is an easy-to-use but versatile backup solution. Key features include delta backup, compression, encryption, filters, as-of-date recovery and more. Operating System: Windows, Linux

69. Attic

If you are looking to minimize the amount of storage space you need for backups, consider Attic, which includes built-in deduplication. It also includes optional 256-bit AES encryption and can transfer files to remote hosts via SSH. Operating System: Linux
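To show why deduplication saves space, here is a minimal sketch of a content-addressed chunk store. It is not Attic's implementation or API: the `DedupStore` class, the fixed four-byte chunk size and the use of SHA-256 digests as keys are assumptions chosen to keep the example small.

```python
import hashlib

class DedupStore:
    """Toy content-addressed store: identical chunks are kept only once."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size  # hypothetical, tiny for demonstration
        self.chunks = {}              # SHA-256 hex digest -> chunk bytes

    def put(self, data: bytes):
        """Store data and return the list of digests needed to rebuild it."""
        recipe = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            # Only chunks that have not been seen before take up new space.
            self.chunks.setdefault(digest, chunk)
            recipe.append(digest)
        return recipe

    def get(self, recipe):
        """Reassemble the original bytes from a list of digests."""
        return b"".join(self.chunks[d] for d in recipe)

store = DedupStore()
first = store.put(b"AAAABBBBCCCC")
second = store.put(b"AAAABBBBDDDD")
```

The two backups above share their first two chunks, so the store holds four unique chunks rather than six.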

70. Backup

This Ruby-based tool promises "easy full stack backup operations on UNIX-like systems." It includes a tool for modeling backups. Operating System: Linux, OS X

71. Backupninja

This tool makes it easier to coordinate and manage backups on your network. It incorporates several of the other tools on this list including Duplicity and rsync. Operating System: Linux

72. BackupPC

Robust enough for enterprise use, BackupPC backs up data from Linux and Windows systems to disk. Noteworthy features include a unique pooling scheme, optional compression, a web interface and support for mobile devices. Operating System: Windows, Linux

73. Back In Time

Inspired by an older solution called FlyBack, Back in Time takes snapshots of specified directories. It's easy to set up and includes a simple scheduler. Operating System: Linux

74. Bacula

Another option for enterprises, Bacula is a network backup solution that aims to be easy to use and very efficient. Commercial support and services for the solution are available through Bacula Systems. Operating System: Windows, Linux, OS X

75. Bareos

Forked from Bacula, Bareos is a popular open source backup option that is under very active development. Bareos.com offers paid support and services for the tool. Operating System: Windows, Linux, OS X

76. Box Backup

This "completely automatic" backup solution creates backups continuously and can also create snapshots when desired. It includes encryption and optional RAID capabilities. Operating System: Windows, Linux

77. BURP

Short for "BackUp And Restore Program," BURP is a network backup solution based on librsync (see below). It is designed to be easier to configure than some other open source solutions, and it can do delta backups. Operating System: Windows, Linux

78. Clonezilla

Designed to replace Acronis True Image or Norton Ghost, Clonezilla is useful for both system deployment and backup and recovery. It comes in two flavors: live for standalone systems and SE for network backup or cloning multiple systems at once. Operating System: Linux

79. Create Synchronicity

Powerful but lightweight, this backup tool takes up only 220KB of space on your drive. It supports multiple languages, has an intuitive interface and includes a scheduler. Operating System: Windows

80. DAR

Disk Archive, a.k.a. DAR, is an older command-line tool for backup. For those who prefer a GUI, one is available through DarGUI. Operating System: Windows, Linux, OS X

81. DirSync Pro

This "small but powerful" utility offers incremental backup, filtering and scheduling capabilities. It also boasts a user-friendly interface, and it can analyze two sets of files or folders and detect the changes between them. Operating System: Windows

82. DriverBackup!

While this utility isn't a complete system backup solution, it does back up Windows drivers. It can also remove unwanted drivers. Operating System: Windows

83. Duplicity

Based on the librsync library, Duplicity creates encrypted archives and uploads them to remote or local servers. It can use GnuPG to encrypt and sign archives if desired. Operating System: Linux

84. FOG

FOG offers cross-platform cloning and imaging capabilities for networks of any size from 5 to 50,000 systems. It boasts that it "offers commercial-grade support at no cost." Operating System: Linux, Windows, OS X.

85. FreeFileSync

A tool for standalone systems, FreeFileSync aims to save users time when setting up and running backups. It is cross-platform and includes 64-bit support. Operating System: Linux, Windows, OS X

86. FullSync

Although it was designed to help web developers push updates to their sites, FullSync can also be used by anyone to create backups. Key features include multiple modes, flexible rules, buffered filesystems, support for multiple file transfer protocols and more. Operating System: Linux, Windows, OS X

87. Grsync

Grsync takes the older rsync synchronization tool and adds an easy-to-use GUI. Noteworthy features include unlimited sessions, highlighted errors, batch capabilities and more. Operating System: Linux, Windows, OS X

88. LuckyBackup

Like Grsync, LuckyBackup was also based on rsync. It has won several awards, but development on this project has slowed. Operating System: Linux, Windows

89. Mondo Rescue

For Linux and FreeBSD only, Mondo Rescue is a disaster recovery solution that supports tape, disk, network or optical media backups. According to its website, its users include "Lockheed-Martin, Nortel Networks, Siemens, HP, IBM, NASA's JPL, the US Dept of Agriculture, dozens of smaller companies." Operating System: Linux, FreeBSD

90. Obnam

Easy-to-use and secure, Obnam is a snapshot backup solution with built-in deduplication and encryption capabilities. It stores data to hard disks or online via SFTP. Operating System: Linux

91. Partimage

This tool saves partitions of drives as image files, making it useful for backup or installing the same image on multiple systems. It can run across networks or on a standalone PC. Operating System: Linux

92. Redo

Redo boasts that it can get a crashed system back up and running in as little as 10 minutes. It's very easy to use and has bare-metal restore capabilities. Operating System: Windows, Linux

93. Rsnapshot

As you might expect from the name, this utility makes a snapshot of your file system for remote or local backup. According to the website, it can be set up in just a few minutes. Operating System: Linux, OS X

94. Rsync

Rsync is a Unix-based file-transfer utility that has synchronization capabilities that make it suitable for creating backups or mirroring. It's a useful tool but is best used by advanced users. Operating System: Linux, Windows, OS X

95. SafeKeep

For Linux users only, SafeKeep focuses on security and simplicity. It's a command line tool that is a good option for a small LAN. Operating System: Linux

96. SMS Backup+

This tool allows you to back up your text messages and call logs to Gmail. You can also transfer data from Gmail back to your phone. Operating System: Android

97. SnapBackup

Designed to be as easy to use as possible, SnapBackup backs up files with just one click. It can copy files to a flash drive, external hard drive or the cloud, and it includes compression capabilities. Operating System: Windows, Linux, OS X

98. Synkron

While this app is focused primarily on synchronization, it can be used for creating backups as well. Key features include analysis capabilities, blacklisting, restores and cross-platform support. Operating System: Windows, Linux, OS X

99. Unison

Like Synkron, Unison is a file synchronization tool. It can copy files between any two systems connected to the internet, and it has features in common with source code management tools as well as with backup utilities. Operating System: Windows, Unix

100. UrBackup

This client-server backup solution does both image and file backups. It promises "both data safety and a fast restoration time." Operating System: Windows, Linux

101. Weex

The Weex developers intended it primarily as a tool for pushing content to websites, but it can also be used to synchronize or back up files. It supports FTP file transfer. Operating System: Windows, Linux

102. Win32DiskImager

Averaging more than 50,000 downloads every week, this tool is a very popular way to copy a disk image to a new machine. It's very useful for systems administrators and developers. Operating System: Windows

103. XSIbackup

XSIbackup can back up VMware ESXi environments version 5.1 or greater. It's a command line tool with a scheduler, and it runs directly on the hypervisor. Operating System: VMware ESXi

Big Data Tools

104. Alluxio

Formerly known as Tachyon, Alluxio describes itself as "a memory-centric distributed storage system enabling reliable data sharing at memory-speed across cluster frameworks." It works with tools like Spark and Hadoop to speed performance on big data queries. Operating System: Linux, OS X

105. Ambari

Part of the Hadoop ecosystem, this Apache project offers an intuitive Web-based interface for provisioning, managing, and monitoring Hadoop clusters. It also provides RESTful APIs for developers who want to integrate Ambari's capabilities into their own applications. Operating System: Windows, Linux, OS X.

106. Avro

This Apache project provides a data serialization system with rich data structures and a compact format. Schemas are defined with JSON and it integrates easily with dynamic languages. Operating System: OS Independent.
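For readers who have not seen an Avro schema, the sketch below builds one as JSON. The record name `User`, the namespace and both fields are invented for illustration, and the code uses only Python's standard `json` module rather than any Avro library.

```python
import json

# A hypothetical Avro record schema, written as a plain Python dict.
user_schema = {
    "type": "record",
    "name": "User",
    "namespace": "example.avro",
    "fields": [
        {"name": "name", "type": "string"},
        # A union type: the field may be null or an int, defaulting to null.
        {"name": "favorite_number", "type": ["null", "int"], "default": None},
    ],
}

# Serialize to the JSON text that would be saved as a schema file.
schema_json = json.dumps(user_schema, indent=2)
```

Because the schema is ordinary JSON, any language with a JSON parser can read it, which is part of what makes the format easy to use from dynamic languages.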

107. Cascading

Cascading is an application development platform based on Hadoop. Commercial support and training are available. Operating System: OS Independent.

108. Chukwa

Based on Hadoop, Chukwa collects data from large distributed systems for monitoring purposes. It also includes tools for analyzing and displaying the data. Operating System: Linux, OS X.

109. Data Torrent RTS

Data Torrent has been around a while, but it first open sourced its Core RTS technology in June of this year. It claims to be "the industry's only open source enterprise-grade unified stream and batch platform." It comes in community, standard and enterprise versions. Operating System: Linux

110. Disco

Originally developed by Nokia, Disco is a distributed computing framework that, like Hadoop, is based on MapReduce. It includes a distributed filesystem and a database that supports billions of keys and values. Operating System: Linux, OS X.
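The MapReduce model that Disco and Hadoop share can be sketched in a few lines on a single machine. The word-count example below is an illustration of the pattern only, not Disco's API: the function names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are hypothetical, and a real framework would distribute each phase across many nodes.

```python
from collections import defaultdict

def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1

def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Combine the values collected for one key into a single result."""
    return key, sum(values)

def word_count(lines):
    """Run map, shuffle and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

result = word_count(["big data tools", "big data"])
```

Because each call to `map_phase` and `reduce_phase` depends only on its own input, the work can be split across machines without the functions changing.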

111. Flume

Flume collects log data from other applications and delivers them into Hadoop. The website boasts, "It is robust and fault tolerant with tunable reliability mechanisms and many failover and recovery mechanisms." Operating System: Linux, OS X.

112. Genie

Created by Netflix, Genie allows IT administrators to manage Hadoop jobs running on cloud computing services. Netflix uses it to run many thousands of Hadoop jobs every day. Operating System: Windows, Linux, OS X

113. Hadoop

This Apache-sponsored project is the best-known big data tool available. Numerous companies, including Amazon Web Services, Cloudera, Hortonworks, IBM, Pivotal, SyncSort and VMware, offer related products or commercial support for Hadoop. Well-known users include Alibaba, AOL, eBay, Facebook, Google, Hulu, LinkedIn, Spotify, Twitter and Yahoo. Operating System: Windows, Linux, OS X

114. Hadoop Distributed File System

HDFS is the file system for Hadoop, but it can also be used as a standalone distributed file system. It's Java-based, fault-tolerant, highly scalable and highly configurable. Operating System: Windows, Linux, OS X.

115. HPCC

This alternative to Hadoop also offers massive parallel processing and storage of big data workloads. Paid enterprise services are available. Operating System: Linux

116. Hypertable

Very popular with Web companies, Hypertable is modeled on Google's Bigtable design as a way to make databases more scalable. Its users include Baidu, eBay, Groupon and Yelp. It is compatible with Hadoop, and commercial support and training are available. Operating System: Linux, OS X

117. Ignite

This Apache project describes itself as "a high-performance, integrated and distributed in-memory platform for computing and transacting on large-scale data sets in real-time, orders of magnitude faster than possible with traditional disk-based or flash technologies." The platform includes data grid, compute grid, service grid, streaming, Hadoop acceleration, advanced clustering, file system, messaging, events and data structure capabilities. Operating System: OS Independent.

118. Kudu

Currently in beta trials, Kudu is an Apache project that is part of the Hadoop ecosystem. It combines a simple data model with columnar storage, low latency and distributed architecture. Operating System: Windows, Linux, OS X

119. Lipstick

This Netflix project provides an easy-to-understand graphical representation of Hadoop Pig jobs. It updates as the job executes so that administrators and developers no longer need to sift through log data. Operating System: Windows, Linux, OS X

120. Lucene

Java-based Lucene performs full-text searches very quickly. According to the website, it can index more than 150GB per hour on modern hardware, and it includes powerful and efficient search algorithms. Development is sponsored by the Apache Software Foundation. Operating System: OS Independent.
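Full-text search engines such as Lucene are built around an inverted index, which maps each term to the documents that contain it. The following is a toy Python sketch of that idea only; it is not Lucene's API or implementation, and the function names and sample documents are made up for illustration.

```python
from collections import defaultdict

def build_index(docs):
    """Map each lowercase term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, query):
    """Return the ids of documents that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

# Hypothetical sample documents keyed by id.
docs = {
    1: "Hadoop stores big data",
    2: "Lucene indexes text",
    3: "Lucene searches big text",
}
index = build_index(docs)
hits = search(index, "lucene text")
print(sorted(hits))
```

A lookup touches only the entries for the query terms rather than scanning every document, which is why this structure suits fast search over large collections.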

121. Lumify

Created by a company called Altamira Technologies, Lumify describes itself as an "open source big data analysis and visualization platform." It makes it easy to create 2D or 3D graphs that show the relationship between entities or to overlay data on maps. For those who are interested in learning more about how it works, the website offers several videos that show Lumify in action, and it also has a demo site that allows users to upload their own data and try out the software. Operating System: Linux.

122. MapReduce

An integral part of Hadoop, MapReduce is a programming model that provides a way to process large distributed datasets. It was originally developed by Google, and it is also used by several other big data tools on our list, including CouchDB, MongoDB and Riak. Operating System: OS Independent.
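The programming model can be sketched in a few lines of plain Python using word counting as the example. This runs in a single process purely for illustration; it does not use Hadoop, and the function names are assumptions chosen for clarity. In a real cluster the framework would run the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict

def map_phase(document):
    """Emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle_phase(pairs):
    """Group emitted values by key, as a framework would do between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)

def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

counts = word_count(["big data tools", "big data big clusters"])
print(counts)
```

Because each map call looks at one record and each reduce call looks at one key, the work can be split across machines without the steps needing to share state.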

123. Mesos

Apache Mesos is a resource abstraction tool that makes it possible for enterprises to treat their entire data center as a single pool of resources, and it is popular with companies that are also running Hadoop, Spark and similar applications. Organizations that use it include Airbnb, CERN, Cisco, Coursera, Foursquare, Groupon, Netflix, Twitter and Uber. Operating System: Linux, OS X

124. Oozie

This workflow scheduler is specifically designed to manage Hadoop jobs. It can trigger jobs by time or by data availability, and it integrates with MapReduce, Pig, Hive, Sqoop and many other related tools. Operating System: Linux, OS X.

125. Pandas

The Pandas project includes data structures and data analysis tools based on the Python programming language. It allows organizations to use Python as an alternative to R for big data analysis projects. Operating System: Windows, Linux, OS X.

126. Pig

Apache Pig is a platform for distributed big data analysis. It relies on a programming language called Pig Latin, which boasts simplified parallel programming, optimization and extensibility. Operating System: OS Independent.

Original article: https://www.cnblogs.com/timssd/p/6275193.html