pyleus

Pyleus is a Python framework for developing and launching Storm topologies.

Statistics on pyleus

Number of watchers on Github 394
Number of open issues 69
Average time to close an issue 21 days
Main language Python
Average time to merge a PR 7 days
Open pull requests 7+
Closed pull requests 6+
Last commit about 4 years ago
Repo Created almost 6 years ago
Repo Last Updated over 2 years ago
Size 1.42 MB
Organization / Author yelparchive
Contributors 15

Pyleus

Pyleus is a Python 2.6+ framework for developing and launching `Apache Storm`_ topologies.

Please visit our documentation_.

=============== ================
master          develop
=============== ================
|master-status| |develop-status|
=============== ================

.. |master-status| image:: https://travis-ci.org/YelpArchive/pyleus.svg?branch=master
   :target: https://travis-ci.org/YelpArchive/pyleus

.. |develop-status| image:: https://travis-ci.org/YelpArchive/pyleus.svg?branch=develop
   :target: https://travis-ci.org/YelpArchive/pyleus

About

Pyleus is a framework for building Apache Storm topologies in idiomatic Python.

With Pyleus you can:

  • define a topology with a simple YAML file

  • manage dependencies with a requirements.txt file

  • run faster thanks to the Pyleus MessagePack_-based serializer

  • pass options to your components directly from the YAML file

  • use the Kafka spout built into Storm with only a YAML change

Install

From PyPI:

.. code-block:: shell

   $ pip install pyleus

Note:

You do NOT need to install pyleus on your Storm cluster. That's cool, isn't it?

However, if you set ``system_site_packages: true`` in your config file, the environment of your Storm nodes must match the one on the machine used to build the topology. In that case you do have to install pyleus on your Storm cluster.

Try it out!

.. code-block:: shell

   $ git clone https://github.com/Yelp/pyleus.git
   $ pyleus build pyleus/examples/exclamation_topology/pyleus_topology.yaml
   $ pyleus local exclamation_topology.jar

Or, submit to a Storm cluster with:

.. code-block:: shell

   $ pyleus submit -n NIMBUS_HOST exclamation_topology.jar

The examples_ directory contains several annotated Pyleus topologies that try to cover as many Pyleus features as possible.

Pyleus command line interface

  • Build a topology:

    .. code-block:: shell

       $ pyleus build /path/to/pyleus_topology.yaml

  • Run a topology locally:

    .. code-block:: shell

       $ pyleus local /path/to/topology.jar

  • Submit a topology to a Storm cluster:

    .. code-block:: shell

       $ pyleus submit [-n NIMBUS_HOST] /path/to/topology.jar

  • List all topologies running on a Storm cluster:

    .. code-block:: shell

       $ pyleus list [-n NIMBUS_HOST]

  • Kill a topology running on a Storm cluster:

    .. code-block:: shell

       $ pyleus kill [-n NIMBUS_HOST] TOPOLOGY_NAME

Try ``pyleus -h`` for a list of all available commands, or ``pyleus CMD -h`` for command-specific help.
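If you drive these commands from a script, a small helper that assembles the argument list can keep the calls consistent. The following is only a sketch: ``pyleus_command`` is a hypothetical helper, not part of Pyleus, and it uses only the subcommands and the ``-n`` option shown above. It requires Python 3 because of the keyword-only argument.

```python
def pyleus_command(subcommand, *args, nimbus_host=None):
    """Return the argv list for a pyleus call (hypothetical helper).

    Only the -n NIMBUS_HOST option from the list above is modelled; it is
    meant for the subcommands that talk to a cluster (submit, list, kill).
    """
    argv = ["pyleus", subcommand]
    if nimbus_host is not None:
        argv += ["-n", nimbus_host]
    argv.extend(args)
    return argv


print(pyleus_command("build", "/path/to/pyleus_topology.yaml"))
print(pyleus_command("kill", "TOPOLOGY_NAME", nimbus_host="NIMBUS_HOST"))
```

The resulting list can be handed to ``subprocess.run`` on a machine where pyleus is installed.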

Write your first topology

Please refer to the documentation_ for a more detailed tutorial.

Organize your files

This is an example of the directory tree of a simple topology:

.. code-block:: none

   my_first_topology/
   |-- my_first_topology/
   |   |-- __init__.py
   |   |-- dummy_bolt.py
   |   |-- dummy_spout.py
   |-- pyleus_topology.yaml
   |-- requirements.txt

Define the topology layout

A simple pyleus_topology.yaml should look like the following:

.. code-block:: yaml

   name: my_first_topology

   topology:

      - spout:
          name: my-first-spout
          module: my_first_topology.dummy_spout

      - bolt:
          name: my-first-bolt
          module: my_first_topology.dummy_bolt
          groupings:
              - shuffle_grouping: my-first-spout

This defines a topology where a single bolt subscribes to the output stream of a single spout. It is as simple as that.
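To make the wiring concrete, the same layout can be written as plain Python data and checked for groupings that point at a component that was never defined. This is a sketch under assumptions: the dictionary is a hand-written mirror of the YAML above, ``unknown_grouping_targets`` is a hypothetical helper, and it only understands groupings whose value is a component name, as ``shuffle_grouping`` is here. It does not reflect how Pyleus itself validates a topology.

```python
# Hand-written mirror of pyleus_topology.yaml as plain Python data.
topology = {
    "name": "my_first_topology",
    "topology": [
        {"spout": {"name": "my-first-spout",
                   "module": "my_first_topology.dummy_spout"}},
        {"bolt": {"name": "my-first-bolt",
                  "module": "my_first_topology.dummy_bolt",
                  "groupings": [{"shuffle_grouping": "my-first-spout"}]}},
    ],
}


def unknown_grouping_targets(spec):
    """Hypothetical check: return (component, target) pairs whose target
    does not match the name of any defined component."""
    # Each list entry is a one-key mapping such as {"spout": {...}}.
    components = [next(iter(entry.values())) for entry in spec["topology"]]
    names = {component["name"] for component in components}
    problems = []
    for component in components:
        for grouping in component.get("groupings", []):
            for target in grouping.values():
                # Only the simple "grouping: component-name" form is handled.
                if isinstance(target, str) and target not in names:
                    problems.append((component["name"], target))
    return problems


print(unknown_grouping_targets(topology))  # []
```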

Write your first spout

This is the code implementing dummy_spout.py:

.. code-block:: python

   from pyleus.storm import Spout


   class DummySpout(Spout):

       # Names of the fields in every tuple this spout emits.
       OUTPUT_FIELDS = ['sentence', 'name']

       def next_tuple(self):
           self.emit(("This is a sentence.", "spout",))


   if __name__ == '__main__':
       DummySpout().run()

Write your first bolt

Let's now look at dummy_bolt.py:

.. code-block:: python

   from pyleus.storm import SimpleBolt


   class DummyBolt(SimpleBolt):

       OUTPUT_FIELDS = ['sentence']

       def process_tuple(self, tup):
           # The values arrive in the order of the spout's OUTPUT_FIELDS.
           sentence, name = tup.values
           new_sentence = "{0} says, \"{1}\"".format(name, sentence)
           # Anchor the new tuple to the incoming one.
           self.emit((new_sentence,), anchors=[tup])


   if __name__ == '__main__':
       DummyBolt().run()
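The string handling inside ``process_tuple`` can be tried without Storm or Pyleus installed. The sketch below pulls that one step into a standalone function; ``format_sentence`` is a hypothetical name used only for illustration, and the input mirrors the tuple emitted by the spout above.

```python
def format_sentence(values):
    """Build the bolt's output sentence from a (sentence, name) pair."""
    sentence, name = values
    return "{0} says, \"{1}\"".format(name, sentence)


# Same values that DummySpout emits.
print(format_sentence(("This is a sentence.", "spout")))
# spout says, "This is a sentence."
```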

Run your topology

Run the topology on your local machine for debugging:

.. code-block:: shell

   $ pyleus build my_first_topology/pyleus_topology.yaml
   $ pyleus local --debug my_first_topology.jar

When you are done, press Ctrl-C.

Configuration File

You can set default values for many configuration options by placing a ``.pyleus.conf`` file in your home directory:

.. code-block:: none

   [storm]
   nimbus_host: 10.11.12.13
   jvm_opts: -Djava.io.tmpdir=/home/myuser/tmp

   [build]
   pypi_index_url: http://pypi.ninjacorp.com/simple/
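The file uses an INI-style layout with ``key: value`` pairs, so the Python 3 standard library can read it if you want to inspect the defaults from your own scripts. This is a minimal sketch, not a description of how Pyleus loads its configuration; the sample text simply repeats the example above.

```python
import configparser

# Same content as the example .pyleus.conf above.
SAMPLE = """\
[storm]
nimbus_host: 10.11.12.13
jvm_opts: -Djava.io.tmpdir=/home/myuser/tmp

[build]
pypi_index_url: http://pypi.ninjacorp.com/simple/
"""

# Interpolation is disabled so a literal "%" in a value is left untouched.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(SAMPLE)

print(parser.get("storm", "nimbus_host"))
print(parser.get("storm", "jvm_opts"))
print(parser.get("build", "pypi_index_url"))
```

To read the real file instead of the sample, ``parser.read(os.path.expanduser("~/.pyleus.conf"))`` should work in the same way.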

Reference

  • `Apache Storm Documentation`_

License

Pyleus is licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0

.. _Apache Storm: https://storm.apache.org/
.. _Apache Storm Documentation: https://storm.apache.org/documentation/Home.html
.. _MessagePack: http://msgpack.org/
.. _documentation: http://pyleus.org/
.. _examples: https://github.com/Yelp/pyleus/tree/master/examples

pyleus open issues
  • almost 4 years Cannot do relative imports from outside the topology folder
  • about 4 years https://yelp.github.io/pyleus/ 404!
  • about 4 years Error when running storm topology using new storm release-1.0.0 and pyleus-0.3.0
  • over 4 years NoClassDefFoundError: backtype/storm/topology/IRichBolt
  • over 4 years How can I write result into MySQL?
  • over 4 years How can I set the path of py files that are extracted from jar?
  • over 4 years run word_count error
  • over 4 years I use remote host as kafka server, how can I configure the kafka server ip and port in kafka_spout options.
  • over 4 years How can I write kafka consumer offset to other zookeepers
  • over 4 years getting IOError: [Errno 36] File name too long on ubuntu
  • over 4 years Workers connections / b.s.util [ERROR] Async loop died!
  • over 4 years about the use of other log management software with pyleus? help!!
  • almost 5 years pyleus and the use of logentries
  • almost 5 years How to define global options in "pyleus_topology.yaml" for all components?
  • almost 5 years Is it possible that one bolt kills the topology?
  • almost 5 years why all tasks attain a same object to excute process
  • almost 5 years Contribute MessagePackSerializer upstream
  • almost 5 years Joining forces with streamparse devs for common pystorm package
  • almost 5 years how to write drpc with pyleus?
  • about 5 years Question for using kafka-client
  • about 5 years KafkaOffsetMonitor can't monitor kafka status when run locally?
  • about 5 years Cannot build on OSX when submitting to Ubuntu cluster
  • over 5 years Pyleus ignoring virtualenv
  • over 5 years bolt died because the read_tuple() has TypeError: 'int' object has no attribute '__getitem__'
  • over 5 years Using kafka-spout, bolt dies after a few hours of running
  • over 5 years Add support for Multilang Metrics Feature (STORM-200)
  • over 5 years topology definition not taking effect in local mode [tune #ackers and parallelism
  • over 5 years Native Cassandra bolt support
  • over 5 years Try out msgpack-java v0.7 support for nested maps and arrays
  • over 5 years Add support for adding externally defined spouts and bolts to pyleus topologies
pyleus open pull requests
  • Storm 0 10 0
  • Add Python 3.3+ and Windows support
  • API for adding java-based spouts to Pyleus topologies
  • Added --always-copy to virtualenv call
  • Depend on simplejson for Python < 2.7
  • fix the error when build under Chinese OS with default GBK encode
  • Dropping support for python 3<x<3.6
pyleus questions on Stack Overflow
  • IRichBolt Error when running topology on storm-1.0.0 and pyleus-0.3.0
  • Using pyleus: NoClassDefFoundError: backtype/storm/topology/IRichBolt
  • how to handle exceptions happen in pyleus Storm tasks
  • How to run pyleus on Storm