General Architecture for Text Engineering (GATE) is an open source full-lifecycle solution for a broad range of Natural Language Processing tasks. GATE excels at text analysis of all shapes and sizes.
GATE includes resources for common Language Engineering (LE) data structures and algorithms, including documents, corpora and various annotation types, a set of language analysis components for Information Extraction, and a range of data visualisation and editing components.
A family of Processing Resources for language analysis is included in the form of ANNIE, A Nearly-New Information Extraction system. ANNIE is a set of modules comprising a tokenizer, a gazetteer, a sentence splitter, a part-of-speech tagger, a named entity transducer and a coreference tagger. It can be used as-is to provide basic information extraction functionality, or serve as a starting point for more specific tasks.
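To make the idea of chained processing modules concrete, here is a minimal sketch of a two-stage pipeline (a tokenizer feeding a gazetteer) that produces stand-off annotations over a text. It assumes nothing about GATE's real classes: `PipelineSketch`, `Annotation`, `tokenize` and `lookup` are hypothetical names invented for this illustration, and the real modules are considerably more sophisticated.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: none of these types belong to GATE's API.
final class PipelineSketch {

    // A stand-off annotation: a typed span of character offsets over the text.
    static final class Annotation {
        final String type;
        final int start;
        final int end;

        Annotation(String type, int start, int end) {
            this.type = type;
            this.start = start;
            this.end = end;
        }

        // The text covered by this annotation.
        String text(String doc) {
            return doc.substring(start, end);
        }
    }

    // Stage 1, tokenizer: every run of letters or digits becomes a Token.
    static List<Annotation> tokenize(String doc) {
        List<Annotation> out = new ArrayList<>();
        Matcher m = Pattern.compile("[\\p{L}\\p{N}]+").matcher(doc);
        while (m.find()) {
            out.add(new Annotation("Token", m.start(), m.end()));
        }
        return out;
    }

    // Stage 2, gazetteer: tokens whose text appears in the list become Lookups.
    static List<Annotation> lookup(String doc, List<Annotation> tokens, Set<String> names) {
        List<Annotation> out = new ArrayList<>();
        for (Annotation t : tokens) {
            if (names.contains(t.text(doc))) {
                out.add(new Annotation("Lookup", t.start, t.end));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String doc = "GATE is developed in Sheffield.";
        Set<String> places = new HashSet<>(Arrays.asList("Sheffield"));
        List<Annotation> tokens = tokenize(doc);
        for (Annotation a : lookup(doc, tokens, places)) {
            System.out.println(a.type + ": " + a.text(doc)); // prints "Lookup: Sheffield"
        }
    }
}
```

Each stage reads the annotations produced by the previous one and adds its own, which is the general shape of a pipeline in which later modules build on the output of earlier ones.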
GATE is a mature solution, having been in development for more than 15 years.
Features include:
- GATE Developer – an integrated development environment for language processing components, bundled with a very widely used Information Extraction system and a comprehensive set of other plugins. GATE Developer is analogous to systems like Mathematica for mathematicians, or JBuilder for Java programmers: it provides a convenient graphical environment for research and development of language processing software. As well as being a powerful research tool in its own right, it is also very useful in conjunction with GATE Embedded.
- GATE Teamware – a collaborative annotation environment for factory-style semantic annotation projects built around a workflow engine and a heavily-optimised backend service infrastructure.
- GATE Embedded – an object library optimised for inclusion in diverse applications, giving access to all the services used by GATE Developer and more.
- An architecture – a high-level organisational picture of how language processing software is composed.
- JAPE, a Java Annotation Patterns Engine, provides regular-expression based pattern/action rules over annotations.
- GUK, the GATE Unicode Kit, fills in some of the gaps in the JDK’s support for Unicode.
- Plugins.
- Supports documents in a variety of formats including XML, RTF, email, HTML, PDF, SGML and plain text, with storage via Java serialisation, Lucene, and PostgreSQL or Oracle databases (RDBMS storage over JDBC).
- Internationalization support: English, Spanish, Chinese, Arabic, Bulgarian, French, German, Hindi, Italian, Cebuano, Romanian, and Russian.
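The pattern/action idea behind JAPE, listed among the features above, can be sketched in plain Java. This is not JAPE syntax and not GATE's API: real JAPE grammars are written in their own rule language, and `RuleSketch`, `Ann`, `code` and `applyPersonRule` are hypothetical names. The sketch assumes the input annotations are sorted by offset and do not overlap. It encodes each annotation type as one character, runs an ordinary regular expression over that sequence, and creates a new annotation spanning each match.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: this is neither JAPE syntax nor GATE's API.
final class RuleSketch {

    // A typed span of character offsets over some text.
    static final class Ann {
        final String type;
        final int start;
        final int end;

        Ann(String type, int start, int end) {
            this.type = type;
            this.start = start;
            this.end = end;
        }
    }

    // Encode an annotation type as one character so a regex can scan the sequence.
    static char code(String type) {
        switch (type) {
            case "Title": return 'T';
            case "Upper": return 'U';
            default:      return 'o';
        }
    }

    // Pattern: a Title followed by one or more Upper annotations.
    // Action: add a Person annotation covering the whole matched run.
    static List<Ann> applyPersonRule(List<Ann> input) {
        StringBuilder codes = new StringBuilder();
        for (Ann a : input) {
            codes.append(code(a.type));
        }
        List<Ann> out = new ArrayList<>();
        Matcher m = Pattern.compile("TU+").matcher(codes);
        while (m.find()) {
            Ann first = input.get(m.start());    // index of first matched annotation
            Ann last = input.get(m.end() - 1);   // index of last matched annotation
            out.add(new Ann("Person", first.start, last.end));
        }
        return out;
    }

    public static void main(String[] args) {
        // Annotations over the text "Dr Ada Lovelace wrote programs".
        List<Ann> anns = Arrays.asList(
                new Ann("Title", 0, 2),
                new Ann("Upper", 3, 6),
                new Ann("Upper", 7, 15),
                new Ann("Lower", 16, 21),
                new Ann("Lower", 22, 30));
        for (Ann p : applyPersonRule(anns)) {
            System.out.println(p.type + " " + p.start + "-" + p.end); // prints "Person 0-15"
        }
    }
}
```

Matching over annotation types rather than raw characters is what lets a rule build on the output of earlier processing steps.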
Website: gate.ac.uk
Support: Documentation, Wiki, GitHub Code Repository
Developer: GATE research team, Department of Computer Science, University of Sheffield
License: GNU Lesser General Public License
GATE is written in Java.