libact: Pool-based Active Learning in Python

authors: Yao-Yuan Yang, Shao-Chuan Lee, Yu-An Chung, Tung-En Wu, Si-An Chen, Hsuan-Tien Lin


Introduction

libact is a Python package designed to make active learning easier for real-world users. The package not only implements several popular active learning strategies, but also features the active-learning-by-learning meta-algorithm, which helps users automatically select the best strategy on the fly. Furthermore, the package provides a unified interface for implementing more strategies, models, and application-specific labelers. The package is open source, with an issue tracker on GitHub, and can be easily installed from the Python Package Index (PyPI).
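As a rough illustration of why a unified strategy interface is convenient, the sketch below puts two interchangeable strategies behind one small abstract base class and drives both with the same loop. All names here (StrategySketch, RandomStrategy, FirstUnlabeledStrategy, run_queries) are defined for this example only and are not imported from libact; the package's real classes may differ in signatures and behavior.

```python
import random
from abc import ABC, abstractmethod


class StrategySketch(ABC):
    """Stand-in interface: decides which unlabeled entry to ask about next."""

    def __init__(self, labels):
        # Assumption of this sketch: labels[i] is None while entry i is unlabeled.
        self.labels = labels

    def unlabeled_ids(self):
        return [i for i, lb in enumerate(self.labels) if lb is None]

    @abstractmethod
    def make_query(self):
        """Return the index of the entry to label next."""


class RandomStrategy(StrategySketch):
    """Baseline: pick any unlabeled entry uniformly at random."""

    def __init__(self, labels, seed=0):
        super().__init__(labels)
        self.rng = random.Random(seed)

    def make_query(self):
        return self.rng.choice(self.unlabeled_ids())


class FirstUnlabeledStrategy(StrategySketch):
    """Trivial alternative: always pick the first unlabeled entry."""

    def make_query(self):
        return self.unlabeled_ids()[0]


def run_queries(strategy, answers, n_queries):
    """Same loop for any strategy: ask, look up the answer, record it."""
    asked = []
    for _ in range(n_queries):
        ask_id = strategy.make_query()
        strategy.labels[ask_id] = answers[ask_id]
        asked.append(ask_id)
    return asked


answers = [0, 0, 1, 1]  # hidden ground truth used in place of a labeler
first = run_queries(FirstUnlabeledStrategy([0, None, None, None]), answers, 2)
rand = run_queries(RandomStrategy([0, None, None, None], seed=1), answers, 2)
```

Because the loop only calls make_query, swapping one strategy for another does not require touching the rest of the code.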

Documentation

The technical report associated with the package is on arXiv, and the documentation for the latest release is available on readthedocs. Comments and questions about the package are welcome at libact-users@googlegroups.com. All contributions to the documentation are greatly appreciated!

Basic Dependencies

  • Python 2.7, 3.3, 3.4, 3.5

  • Python dependencies

    pip install -r requirements.txt
    
  • Debian (>= 7) / Ubuntu (>= 14.04)

    sudo apt-get install build-essential gfortran libatlas-base-dev liblapacke-dev python3-dev
    
  • macOS

    brew install openblas
    

Installation

After resolving the dependencies, you may install the package via pip (for all users):

sudo pip install libact

or pip install in home directory:

pip install --user libact

or pip install from github repository for latest source:

pip install git+https://github.com/ntucllab/libact.git

To build and install from source in your home directory:

python setup.py install --user

To build and install from source for all users on Unix/Linux:

python setup.py build
sudo python setup.py install
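
After any of the installation routes above, one quick way to check that the package is visible to the interpreter you intend to use is to ask Python whether it can locate it. This is a minimal sketch that uses only the standard library; the helper name is_importable is made up for this example.

```python
import importlib.util


def is_importable(package_name):
    """Return True if a top-level package can be found on the current Python path."""
    return importlib.util.find_spec(package_name) is not None


if is_importable("libact"):
    print("libact is importable")
else:
    print("libact was not found; re-check the installation steps")
```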

Usage

The main usage of libact is as follows:

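from libact.models import LogisticRegression
from libact.query_strategies import UncertaintySampling

# trn_ds is a libact Dataset and lbr is a labeler instance, both created beforehand
qs = UncertaintySampling(trn_ds, method='lc', model=LogisticRegression()) # query strategy instance, with the model used to score uncertainty

ask_id = qs.make_query() # let the specified query strategy suggest an example to query
X, y = zip(*trn_ds.data)
lb = lbr.label(X[ask_id]) # query the label of the unlabeled example from the labeler instance
trn_ds.update(ask_id, lb) # update the dataset with the newly queried label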
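
To make the query step more concrete, here is a self-contained sketch of the least-confidence idea that method='lc' refers to: among the unlabeled points, ask about the one whose most likely class has the lowest predicted probability. The toy model (a softmax over negative squared distance to class means), the function names, and the data are assumptions made for this example and are not libact's implementation.

```python
import math


def class_probabilities(x, labeled):
    """Toy probabilistic model: softmax over negative squared distance to class means."""
    by_class = {}
    for xi, yi in labeled:
        by_class.setdefault(yi, []).append(xi)
    scores = {c: -(x - sum(v) / len(v)) ** 2 for c, v in by_class.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}


def least_confident_query(X, y):
    """Return the index of the unlabeled point whose top class probability is lowest."""
    labeled = [(xi, yi) for xi, yi in zip(X, y) if yi is not None]
    unlabeled = [i for i, yi in enumerate(y) if yi is None]
    return min(
        unlabeled,
        key=lambda i: max(class_probabilities(X[i], labeled).values()),
    )


X = [0.0, 1.0, 4.0, 5.2, 9.0, 10.0]
y = [0, None, None, None, None, 1]
oracle = [0, 0, 0, 1, 1, 1]  # hidden ground truth, standing in for the labeler

ask_id = least_confident_query(X, y)  # the point nearest the class boundary
y[ask_id] = oracle[ask_id]            # record the newly revealed label
```

With one labeled point at each end, the point at 5.2 lies closest to the midpoint between the two class means, so it is the one the sketch queries first.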

Some examples are available under the examples directory. Before running, use examples/get_dataset.py to retrieve the dataset used by the examples.

Available examples:

  • plot: This example demonstrates basic usage of libact. It splits a fully labeled dataset and removes some of the labels to simulate the pool-based active learning scenario. Each query of an unlabeled example is then equivalent to revealing one labeled example in the original dataset.
  • label_digits: This example shows how to use libact when you want a human to label the selected sample for your algorithm.
  • albl_plot: This example compares the performance of ALBL with other active learning algorithms.
  • multilabel_plot: This example compares the performance of algorithms under the multilabel setting.
  • alce_plot: This example compares the performance of algorithms under the cost-sensitive multi-class setting.
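
The simulation described for the plot example can be sketched without any dependencies: hide most labels of a fully labeled dataset, then answer each query by looking up the original label. The helper names simulate_pool and ideal_label, the toy data, and the placeholder query rule are assumptions made for this illustration; they do not reproduce the code in the examples directory.

```python
import random


def simulate_pool(X, y_true, n_labeled, seed=0):
    """Hide all but n_labeled labels, keeping at least one example of every class."""
    rng = random.Random(seed)
    keep = set()
    for c in sorted(set(y_true)):
        keep.add(rng.choice([i for i, yi in enumerate(y_true) if yi == c]))
    rest = [i for i in range(len(X)) if i not in keep]
    rng.shuffle(rest)
    keep.update(rest[: max(0, n_labeled - len(keep))])
    return [yi if i in keep else None for i, yi in enumerate(y_true)]


def ideal_label(X, y_true, feature):
    """Reveal the original label of a feature vector (assumes features are unique)."""
    return y_true[X.index(feature)]


X = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 2), (3, 3)]
y_true = [0, 0, 0, 1, 1, 1]
y_pool = simulate_pool(X, y_true, n_labeled=3)

queries = 0
while None in y_pool:
    ask_id = y_pool.index(None)  # placeholder for a real query strategy
    y_pool[ask_id] = ideal_label(X, y_true, X[ask_id])
    queries += 1
```

Once every hidden label has been revealed, the pool matches the original dataset again, which is what makes this setup usable for comparing strategies offline.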

Running tests

To run the test suite:

python setup.py test

To run pylint, install it with pip install pylint and run the following command in the root directory:

pylint libact

To measure the test code coverage, install coverage with pip install coverage and run the following commands in the root directory:

coverage run --source libact --omit '*/tests/*' setup.py test
coverage report

Citing

If you find this package useful, please cite the original works (see the Reference section of each strategy) as well as the following technical report:

@techreport{YY2017,
  author = {Yao-Yuan Yang and Shao-Chuan Lee and Yu-An Chung and Tung-En Wu and Si-An Chen and Hsuan-Tien Lin},
  title = {libact: Pool-based Active Learning in Python},
  institution = {National Taiwan University},
  url = {https://github.com/ntucllab/libact},
  note = {available as arXiv preprint \url{https://arxiv.org/abs/1710.00379}},
  month = oct,
  year = 2017
}

Acknowledgments

The authors thank Chih-Wei Chang and other members of the Computational Learning Lab at National Taiwan University for valuable discussions and various contributions to making this package better.

libact latest release notes
v0.1.3 Official version v0.1.3

Documentation polished.

v0.1.3b1 HierarchicalSampling

Added HierarchicalSampling and updated the author list.

v0.1.3b0 Extending the scope of Libact

Added more algorithm implementations

  • Binary Classification
    • #72 Density Weighted Uncertainty Sampling
    • #79 Uncertainty Sampling (entropy)
    • #80 Query By Committee (KL-divergence)
    • #98 Expected Error Reduction
  • Multilabel Classification
    • #74 Maximum loss reduction with Maximal Confidence
    • #83 Multilabel active learning with Auxiliary Learner
    • #85 Binary Minimization
    • #91 Adaptive active learning
  • Cost-Sensitive Multiclass Classification
    • #76 Active Learning with Cost Embedding

Model

  • #70 Scikit-learn model adapter
  • Multilabel classification model

Documentation
