avro-commits mailing list archives

From kojirom...@apache.org
Subject [avro] branch master updated: AVRO-831 Refactor lang/py setup and test structure (#733)
Date Wed, 11 Dec 2019 00:30:57 GMT
This is an automated email from the ASF dual-hosted git repository.

kojiromike pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/avro.git


The following commit(s) were added to refs/heads/master by this push:
     new 9ab19ee  AVRO-831 Refactor lang/py setup and test structure (#733)
9ab19ee is described below

commit 9ab19ee49678e930beeb05d2f1c7fedbd663223d
Author: Michael A. Smith <michael@smith-li.com>
AuthorDate: Tue Dec 10 19:30:45 2019 -0500

    AVRO-831 Refactor lang/py setup and test structure (#733)
    
    * AVRO-831: Pythonic Build
    
    * AVRO-831: Rework lang/py Setup
    
    Decouple python setup from ant and enable tests to run without it.
    
    * AVRO-831: Run Tests Normally
    
    * AVRO-831: Skip Java Tether if No JDK
    
    Make it easier to test on systems without Java
    
    * AVRO-831: Modern Exception Syntax
    
    * AVRO-831: Exterminate Ants
    
    * AVRO-831: Found One More Ant
    
    * AVRO-831: Remove pdb
    
    * AVRO-831: iSort Order
    
    * AVRO-831: Rename Build Subcommand
    
    * AVRO-831: Correct PYTHONPATH in Interop Tests
    
    * AVRO-831: Skip Test if JAR Not Found
    
    * AVRO-831: Set Package Version Correctly
    
    Fixes an error where the package metadata version is not used, causing setuptools to ignore it.
    
    Co-Authored-By: RyanSkraba <ryan@skraba.com>
    
    * AVRO-831: Ignore More Files Created by Setup
    
    Adds additional files that are created during setup to .gitignore because they are generated and should not be in version control.
    
    * AVRO-831: Clean Directly from build.sh
    
    Python packaging is [moving away from embedding commands in
    setup.py][1]. It is difficult to maintain external commands in Python
    this way. Managing the dependencies needed to run commands from within
    setup.py is gnarly, because dependencies cannot be resolved so early.
    Furthermore, it's difficult to test code that happens at the point at
    which tests themselves are triggered, so that code needs to be very
    simple. A shell script is simple and more appropriate for this use case.
    
    At another time we should look to extract the lint command as well.
    
    [1]: https://github.com/pypa/setuptools/issues/931
---
 build.sh                                           |  14 +-
 doc/src/content/xdocs/gettingstartedpython.xml     |   5 +-
 lang/py/.gitignore                                 |   6 +
 lang/py/{src => }/avro/LICENSE                     |   0
 lang/py/{src => }/avro/NOTICE                      |   0
 lang/py/{src => }/avro/__init__.py                 |   0
 lang/py/{src => }/avro/constants.py                |   0
 lang/py/{src => }/avro/datafile.py                 |   0
 lang/py/{src => }/avro/io.py                       |   0
 lang/py/{src => }/avro/ipc.py                      |  20 +-
 lang/py/{src => }/avro/protocol.py                 |   0
 lang/py/{src => }/avro/schema.py                   |   0
 lang/py/{ => avro}/test/__init__.py                |   0
 lang/py/{ => avro}/test/av_bench.py                |   0
 lang/py/{ => avro}/test/gen_interop_data.py        |  28 +--
 lang/py/{ => avro}/test/mock_tether_parent.py      |   1 -
 lang/py/{ => avro}/test/sample_http_client.py      |   0
 lang/py/{ => avro}/test/sample_http_server.py      |   0
 lang/py/{ => avro}/test/test_datafile.py           |   1 -
 lang/py/{ => avro}/test/test_datafile_interop.py   |  13 +-
 lang/py/{ => avro}/test/test_io.py                 |   1 -
 lang/py/{ => avro}/test/test_ipc.py                |   1 -
 lang/py/{ => avro}/test/test_protocol.py           |   0
 lang/py/{ => avro}/test/test_schema.py             |   1 -
 lang/py/{ => avro}/test/test_script.py             |   2 +-
 lang/py/{ => avro}/test/test_tether_task.py        |   9 +-
 lang/py/{ => avro}/test/test_tether_task_runner.py |  13 +-
 lang/py/{ => avro}/test/test_tether_word_count.py  |  34 +++-
 lang/py/{ => avro}/test/txsample_http_client.py    |   0
 lang/py/{ => avro}/test/txsample_http_server.py    |   0
 lang/py/{ => avro}/test/word_count_task.py         |   0
 lang/py/{src => }/avro/tether/__init__.py          |   0
 lang/py/{src => }/avro/tether/tether_task.py       |   0
 .../py/{src => }/avro/tether/tether_task_runner.py |   0
 lang/py/{src => }/avro/tether/util.py              |   0
 lang/py/{src => }/avro/timezones.py                |   0
 lang/py/{src => }/avro/tool.py                     |   0
 lang/py/{src => }/avro/txipc.py                    |   0
 lang/py/build.sh                                   |  58 ++++--
 lang/py/ivy.xml                                    |  24 ---
 lang/py/ivysettings.xml                            |  30 ---
 lang/py/lib/pyAntTasks-1.3-LICENSE.txt             | 202 ---------------------
 lang/py/lib/pyAntTasks-1.3.jar                     | Bin 18788 -> 0 bytes
 lang/py/scripts/avro                               |  24 ++-
 lang/py/setup.cfg                                  |  48 +++++
 lang/py/setup.py                                   | 129 +++++++++----
 lang/py/test/set_avro_test_path.py                 |  44 -----
 share/test/interop/bin/test_rpc_interop.sh         |   6 +-
 48 files changed, 294 insertions(+), 420 deletions(-)

diff --git a/build.sh b/build.sh
index 838a5c4..8dadeb6 100755
--- a/build.sh
+++ b/build.sh
@@ -60,7 +60,7 @@ do
       (cd lang/php; ./build.sh test)
       (cd lang/perl; ./build.sh test)
 
-      (cd lang/py; ant interop-data-generate)
+      (cd lang/py; ./build.sh interop-data-generate)
       (cd lang/py3; python3 setup.py generate_interop_data \
         --schema-file=../../share/test/schemas/interop.avsc --output-path=../../build/interop/data)
       (cd lang/c; ./build.sh interop-data-generate)
@@ -72,7 +72,7 @@ do
 
       # run interop data tests
       (cd lang/java/ipc; mvn -B test -P interop-data-test)
-      (cd lang/py; ant interop-data-test)
+      (cd lang/py; ./build.sh interop-data-test)
       (cd lang/py3; python3 setup.py test --test-suite avro.tests.test_datafile_interop.TestDataFileInterop)
       (cd lang/c; ./build.sh interop-data-test)
       #(cd lang/c++; make interop-data-test)
@@ -122,17 +122,11 @@ do
 
       (cd lang/py; ./build.sh dist)
       (cd lang/py3; ./build.sh dist)
-
       (cd lang/c; ./build.sh dist)
-
       (cd lang/c++; ./build.sh dist)
-
       (cd lang/csharp; ./build.sh dist)
-
       (cd lang/js; ./build.sh dist)
-
       (cd lang/ruby; ./build.sh dist)
-
       (cd lang/php; ./build.sh dist)
 
       mkdir -p dist/perl
@@ -178,7 +172,7 @@ do
       rm -rf lang/java/*/userlogs/
       rm -rf lang/java/*/dependency-reduced-pom.xml
 
-      (cd lang/py; ant clean)
+      (cd lang/py; ./build.sh clean)
       rm -rf lang/py/userlogs/
 
       (cd lang/py3; python3 setup.py clean)
@@ -213,7 +207,7 @@ do
       rm -rf lang/java/*/userlogs/
       rm -rf lang/java/*/dependency-reduced-pom.xml
 
-      (cd lang/py; ant clean)
+      (cd lang/py; ./build.sh clean)
       rm -rf lang/py/userlogs/
 
       (cd lang/py3; python3 setup.py clean)
diff --git a/doc/src/content/xdocs/gettingstartedpython.xml b/doc/src/content/xdocs/gettingstartedpython.xml
index fe74e69..d29adb5 100644
--- a/doc/src/content/xdocs/gettingstartedpython.xml
+++ b/doc/src/content/xdocs/gettingstartedpython.xml
@@ -48,7 +48,7 @@
       <source>
 $ tar xvf avro-&AvroVersion;.tar.gz
 $ cd avro-&AvroVersion;
-$ sudo python setup.py install
+$ python setup.py install
 $ python
 >>> import avro # should not raise ImportError
       </source>
@@ -58,8 +58,7 @@ $ python
       </p>
       <source>
 $ cd lang/py/
-$ ant
-$ sudo python setup.py install
+$ python setup.py install
 $ python
 >>> import avro # should not raise ImportError
       </source>
diff --git a/lang/py/.gitignore b/lang/py/.gitignore
index ef29019..2eaad89 100644
--- a/lang/py/.gitignore
+++ b/lang/py/.gitignore
@@ -3,3 +3,9 @@
 build/
 lib/
 userlogs/
+avro/HandshakeRequest.avsc
+avro/HandshakeResponse.avsc
+avro/VERSION.txt
+avro/interop.avsc
+avro/tether/InputProtocol.avpr
+avro/tether/OutputProtocol.avpr
diff --git a/lang/py/src/avro/LICENSE b/lang/py/avro/LICENSE
similarity index 100%
rename from lang/py/src/avro/LICENSE
rename to lang/py/avro/LICENSE
diff --git a/lang/py/src/avro/NOTICE b/lang/py/avro/NOTICE
similarity index 100%
rename from lang/py/src/avro/NOTICE
rename to lang/py/avro/NOTICE
diff --git a/lang/py/src/avro/__init__.py b/lang/py/avro/__init__.py
similarity index 100%
rename from lang/py/src/avro/__init__.py
rename to lang/py/avro/__init__.py
diff --git a/lang/py/src/avro/constants.py b/lang/py/avro/constants.py
similarity index 100%
rename from lang/py/src/avro/constants.py
rename to lang/py/avro/constants.py
diff --git a/lang/py/src/avro/datafile.py b/lang/py/avro/datafile.py
similarity index 100%
rename from lang/py/src/avro/datafile.py
rename to lang/py/avro/datafile.py
diff --git a/lang/py/src/avro/io.py b/lang/py/avro/io.py
similarity index 100%
rename from lang/py/src/avro/io.py
rename to lang/py/avro/io.py
diff --git a/lang/py/src/avro/ipc.py b/lang/py/avro/ipc.py
similarity index 97%
rename from lang/py/src/avro/ipc.py
rename to lang/py/avro/ipc.py
index e21245e..17ad175 100644
--- a/lang/py/src/avro/ipc.py
+++ b/lang/py/avro/ipc.py
@@ -23,22 +23,22 @@ from __future__ import absolute_import, division, print_function
 
 import httplib
 import io
+import os
 
 import avro.io
 from avro import protocol, schema
 
-#
-# Constants
-#
 
-# Handshake schema is pulled in during build
-HANDSHAKE_REQUEST_SCHEMA = schema.parse("""
-@HANDSHAKE_REQUEST_SCHEMA@
-""")
+def _load(name):
+  dir_path = os.path.dirname(__file__)
+  rsrc_path = os.path.join(dir_path, name)
+  with open(rsrc_path, 'rb') as f:
+    return f.read()
 
-HANDSHAKE_RESPONSE_SCHEMA = schema.parse("""
-@HANDSHAKE_RESPONSE_SCHEMA@
-""")
+HANDSHAKE_REQUEST_SCHEMA_JSON = _load('HandshakeRequest.avsc')
+HANDSHAKE_RESPONSE_SCHEMA_JSON = _load('HandshakeResponse.avsc')
+HANDSHAKE_REQUEST_SCHEMA = schema.parse(HANDSHAKE_REQUEST_SCHEMA_JSON)
+HANDSHAKE_RESPONSE_SCHEMA = schema.parse(HANDSHAKE_RESPONSE_SCHEMA_JSON)
 
 HANDSHAKE_REQUESTOR_WRITER = avro.io.DatumWriter(HANDSHAKE_REQUEST_SCHEMA)
 HANDSHAKE_REQUESTOR_READER = avro.io.DatumReader(HANDSHAKE_RESPONSE_SCHEMA)
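
The ipc.py hunk above replaces build-time placeholder substitution with reading the schema files from the package directory at import time. A minimal Python 3 sketch of that pattern follows. The names `load_resource`, `demo`, and `Example.avsc` are hypothetical, and a temporary directory stands in for the installed package; real code would pass `__file__`.

```python
import json
import os
import tempfile


def load_resource(module_file, name):
    """Return the bytes of `name`, looked up in the directory of `module_file`."""
    dir_path = os.path.dirname(os.path.abspath(module_file))
    with open(os.path.join(dir_path, name), "rb") as f:
        return f.read()


def demo():
    # A temporary directory plays the role of the package directory.
    with tempfile.TemporaryDirectory() as pkg_dir:
        fake_module = os.path.join(pkg_dir, "ipc.py")  # only its directory matters
        with open(os.path.join(pkg_dir, "Example.avsc"), "w") as f:
            json.dump({"type": "record", "name": "Example", "fields": []}, f)
        raw = load_resource(fake_module, "Example.avsc")
        return json.loads(raw.decode("utf-8"))


if __name__ == "__main__":
    print(demo()["name"])
```

Resolving the path from the module's own location keeps the lookup independent of the current working directory, which is presumably why the tests can now run without a prior ant step.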
diff --git a/lang/py/src/avro/protocol.py b/lang/py/avro/protocol.py
similarity index 100%
rename from lang/py/src/avro/protocol.py
rename to lang/py/avro/protocol.py
diff --git a/lang/py/src/avro/schema.py b/lang/py/avro/schema.py
similarity index 100%
rename from lang/py/src/avro/schema.py
rename to lang/py/avro/schema.py
diff --git a/lang/py/test/__init__.py b/lang/py/avro/test/__init__.py
similarity index 100%
rename from lang/py/test/__init__.py
rename to lang/py/avro/test/__init__.py
diff --git a/lang/py/test/av_bench.py b/lang/py/avro/test/av_bench.py
similarity index 100%
rename from lang/py/test/av_bench.py
rename to lang/py/avro/test/av_bench.py
diff --git a/lang/py/test/gen_interop_data.py b/lang/py/avro/test/gen_interop_data.py
similarity index 76%
rename from lang/py/test/gen_interop_data.py
rename to lang/py/avro/test/gen_interop_data.py
index 13bf86c..3fe372e 100644
--- a/lang/py/test/gen_interop_data.py
+++ b/lang/py/avro/test/gen_interop_data.py
@@ -23,7 +23,9 @@ from __future__ import absolute_import, division, print_function
 import os
 import sys
 
-from avro import datafile, io, schema
+import avro.datafile
+import avro.io
+import avro.schema
 
 CODECS_TO_VALIDATE = ('null', 'deflate')
 
@@ -41,7 +43,7 @@ except ImportError:
 DATUM = {
   'intField': 12,
   'longField': 15234324,
-  'stringField': unicode('hey'),
+  'stringField': 'hey',
   'boolField': True,
   'floatField': 1234.0,
   'doubleField': -1234.0,
@@ -55,16 +57,18 @@ DATUM = {
   'recordField': {'label': 'blah', 'children': [{'label': 'inner', 'children': []}]},
 }
 
-if __name__ == "__main__":
+def generate(schema_path, output_path):
   for codec in CODECS_TO_VALIDATE:
-    interop_schema = schema.parse(open(sys.argv[1], 'r').read())
-    filename = sys.argv[2]
+    with open(schema_path, 'rb') as schema_file:
+      interop_schema = avro.schema.parse(schema_file.read())
+    filename = output_path
     if codec != 'null':
-      base, ext = os.path.splitext(filename)
+      base, ext = os.path.splitext(output_path)
       filename = base + "_" + codec + ext
-    writer = open(filename, 'wb')
-    datum_writer = io.DatumWriter()
-    # NB: not using compression
-    dfw = datafile.DataFileWriter(writer, datum_writer, interop_schema, codec=codec)
-    dfw.append(DATUM)
-    dfw.close()
+    with avro.datafile.DataFileWriter(open(filename, 'wb'), avro.io.DatumWriter(),
+                                      interop_schema, codec=codec) as dfw:
+      # NB: not using compression
+      dfw.append(DATUM)
+
+if __name__ == "__main__":
+  generate(sys.argv[1], sys.argv[2])
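
The naming convention used by `generate` above and the parsing done in test_datafile_interop.py can be sketched together as follows. This is an illustration in Python 3, not the project's code: `output_name`, `codec_of`, `is_readable`, and `KNOWN_CODECS` are hypothetical names, and `KNOWN_CODECS` is only a stand-in for `avro.datafile.VALID_CODECS`.

```python
import os

# Stand-in for avro.datafile.VALID_CODECS; the real tuple may list more codecs.
KNOWN_CODECS = ("null", "deflate")


def output_name(output_path, codec):
    """Name the file for a codec: 'null' keeps the path, others get a suffix."""
    if codec == "null":
        return output_path
    base, ext = os.path.splitext(output_path)
    return base + "_" + codec + ext


def codec_of(filename):
    """Return the codec implied by the file name, or None if it has no suffix."""
    parts = os.path.splitext(os.path.basename(filename))[0].split("_", 1)
    return parts[1] if len(parts) == 2 else None


def is_readable(filename):
    """Mirror the interop test's filter: no suffix, or a known codec suffix."""
    codec = codec_of(filename)
    return codec is None or codec in KNOWN_CODECS


if __name__ == "__main__":
    for codec in KNOWN_CODECS:
        name = output_name("build/interop/data/py.avro", codec)
        print(name, codec_of(name), is_readable(name))
```

Note that the split is on the first underscore, so a data file whose base name contains an underscore for other reasons would be treated as carrying an unsupported codec and skipped.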
diff --git a/lang/py/test/mock_tether_parent.py b/lang/py/avro/test/mock_tether_parent.py
similarity index 99%
rename from lang/py/test/mock_tether_parent.py
rename to lang/py/avro/test/mock_tether_parent.py
index 88d84dd..d490313 100644
--- a/lang/py/test/mock_tether_parent.py
+++ b/lang/py/avro/test/mock_tether_parent.py
@@ -25,7 +25,6 @@ from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer
 
 import avro.tether.tether_task
 import avro.tether.util
-import set_avro_test_path
 from avro import ipc, protocol
 
 SERVER_ADDRESS = ('localhost', avro.tether.util.find_port())
diff --git a/lang/py/test/sample_http_client.py b/lang/py/avro/test/sample_http_client.py
similarity index 100%
rename from lang/py/test/sample_http_client.py
rename to lang/py/avro/test/sample_http_client.py
diff --git a/lang/py/test/sample_http_server.py b/lang/py/avro/test/sample_http_server.py
similarity index 100%
rename from lang/py/test/sample_http_server.py
rename to lang/py/avro/test/sample_http_server.py
diff --git a/lang/py/test/test_datafile.py b/lang/py/avro/test/test_datafile.py
similarity index 99%
rename from lang/py/test/test_datafile.py
rename to lang/py/avro/test/test_datafile.py
index b020222..68491e7 100644
--- a/lang/py/test/test_datafile.py
+++ b/lang/py/avro/test/test_datafile.py
@@ -22,7 +22,6 @@ from __future__ import absolute_import, division, print_function
 import os
 import unittest
 
-import set_avro_test_path
 from avro import datafile, io, schema
 
 SCHEMAS_TO_VALIDATE = (
diff --git a/lang/py/test/test_datafile_interop.py b/lang/py/avro/test/test_datafile_interop.py
similarity index 79%
rename from lang/py/test/test_datafile_interop.py
rename to lang/py/avro/test/test_datafile_interop.py
index 329b9a1..7c79502 100644
--- a/lang/py/test/test_datafile_interop.py
+++ b/lang/py/avro/test/test_datafile_interop.py
@@ -22,24 +22,30 @@ from __future__ import absolute_import, division, print_function
 import os
 import unittest
 
-import set_avro_test_path
+import avro
 from avro import datafile, io
 
+_INTEROP_DATA_DIR = os.path.join(os.path.dirname(avro.__file__), 'test', 'interop', 'data')
 
+@unittest.skipUnless(os.path.exists(_INTEROP_DATA_DIR),
+                     "{} does not exist".format(_INTEROP_DATA_DIR))
 class TestDataFileInterop(unittest.TestCase):
   def test_interop(self):
+    ran = False
     print()
     print('TEST INTEROP')
     print('============')
     print()
-    for f in os.listdir('@INTEROP_DATA_DIR@'):
+    for f in os.listdir(_INTEROP_DATA_DIR):
+      ran = True
+
       base_ext = os.path.splitext(os.path.basename(f))[0].split('_', 1)
       if len(base_ext) < 2 or base_ext[1] in datafile.VALID_CODECS:
         print('READING %s' % f)
         print('')
 
         # read data in binary from file
-        reader = open(os.path.join('@INTEROP_DATA_DIR@', f), 'rb')
+        reader = open(os.path.join(_INTEROP_DATA_DIR, f), 'rb')
         datum_reader = io.DatumReader()
         dfr = datafile.DataFileReader(reader, datum_reader)
         i = 0
@@ -49,6 +55,7 @@ class TestDataFileInterop(unittest.TestCase):
       else:
         print('SKIPPING %s due to an unsupported codec' % f)
         print('')
+    self.assertTrue(ran, "Didn't find any interop data files to test")
 
 if __name__ == '__main__':
   unittest.main()
diff --git a/lang/py/test/test_io.py b/lang/py/avro/test/test_io.py
similarity index 99%
rename from lang/py/test/test_io.py
rename to lang/py/avro/test/test_io.py
index 93fa2b1..b4bd30a 100644
--- a/lang/py/test/test_io.py
+++ b/lang/py/avro/test/test_io.py
@@ -26,7 +26,6 @@ from binascii import hexlify
 from decimal import Decimal
 
 import avro.io
-import set_avro_test_path
 from avro import schema, timezones
 
 SCHEMAS_TO_VALIDATE = (
diff --git a/lang/py/test/test_ipc.py b/lang/py/avro/test/test_ipc.py
similarity index 98%
rename from lang/py/test/test_ipc.py
rename to lang/py/avro/test/test_ipc.py
index bc9bd21..6be7d05 100644
--- a/lang/py/test/test_ipc.py
+++ b/lang/py/avro/test/test_ipc.py
@@ -26,7 +26,6 @@ from __future__ import absolute_import, division, print_function
 
 import unittest
 
-import set_avro_test_path
 # This test does import this code, to make sure it at least passes
 # compilation.
 from avro import ipc
diff --git a/lang/py/test/test_protocol.py b/lang/py/avro/test/test_protocol.py
similarity index 100%
rename from lang/py/test/test_protocol.py
rename to lang/py/avro/test/test_protocol.py
diff --git a/lang/py/test/test_schema.py b/lang/py/avro/test/test_schema.py
similarity index 99%
rename from lang/py/test/test_schema.py
rename to lang/py/avro/test/test_schema.py
index 3b0b4e2..542e511 100644
--- a/lang/py/test/test_schema.py
+++ b/lang/py/avro/test/test_schema.py
@@ -25,7 +25,6 @@ import json
 import unittest
 import warnings
 
-import set_avro_test_path
 from avro import schema
 
 
diff --git a/lang/py/test/test_script.py b/lang/py/avro/test/test_script.py
similarity index 99%
rename from lang/py/test/test_script.py
rename to lang/py/avro/test/test_script.py
index ed66e2d..a92754f 100644
--- a/lang/py/test/test_script.py
+++ b/lang/py/avro/test/test_script.py
@@ -63,7 +63,7 @@ def looney_records():
     for f, l, t in LOONIES:
         yield {"first": f, "last" : l, "type" : t}
 
-SCRIPT = join(dirname(__file__), "..", "scripts", "avro")
+SCRIPT = join(dirname(dirname(dirname(__file__))), "scripts", "avro")
 
 _JSON_PRETTY = '''{
     "type": "duck",
diff --git a/lang/py/test/test_tether_task.py b/lang/py/avro/test/test_tether_task.py
similarity index 95%
rename from lang/py/test/test_tether_task.py
rename to lang/py/avro/test/test_tether_task.py
index 7b1bea0..78de323 100644
--- a/lang/py/test/test_tether_task.py
+++ b/lang/py/avro/test/test_tether_task.py
@@ -27,12 +27,11 @@ import time
 import unittest
 
 import avro.io
+import avro.test.mock_tether_parent
+import avro.test.word_count_task
 import avro.tether.tether_task
 import avro.tether.util
-import mock_tether_parent
-import set_avro_test_path
 from avro import schema, tether
-from word_count_task import WordCountTask
 
 
 class TestTetherTask(unittest.TestCase):
@@ -44,7 +43,7 @@ class TestTetherTask(unittest.TestCase):
     Test that the thether_task is working. We run the mock_tether_parent in a separate
     subprocess
     """
-    task=WordCountTask()
+    task=avro.test.word_count_task.WordCountTask()
 
     proc=None
     try:
@@ -54,7 +53,7 @@ class TestTetherTask(unittest.TestCase):
       env["PYTHONPATH"]=':'.join(sys.path)
       server_port = avro.tether.util.find_port()
 
-      pyfile=mock_tether_parent.__file__
+      pyfile=avro.test.mock_tether_parent.__file__
       proc=subprocess.Popen(["python", pyfile,"start_server","{0}".format(server_port)])
       input_port = avro.tether.util.find_port()
 
diff --git a/lang/py/test/test_tether_task_runner.py b/lang/py/avro/test/test_tether_task_runner.py
similarity index 94%
rename from lang/py/test/test_tether_task_runner.py
rename to lang/py/avro/test/test_tether_task_runner.py
index 741f626..8a79272 100644
--- a/lang/py/test/test_tether_task_runner.py
+++ b/lang/py/avro/test/test_tether_task_runner.py
@@ -31,9 +31,8 @@ import avro.io
 import avro.tether.tether_task
 import avro.tether.tether_task_runner
 import avro.tether.util
-import mock_tether_parent
-import set_avro_test_path
-from word_count_task import WordCountTask
+import avro.test.mock_tether_parent
+import avro.test.word_count_task
 
 
 class TestTetherTaskRunner(unittest.TestCase):
@@ -50,7 +49,7 @@ class TestTetherTaskRunner(unittest.TestCase):
       env["PYTHONPATH"]=':'.join(sys.path)
       parent_port = avro.tether.util.find_port()
 
-      pyfile=mock_tether_parent.__file__
+      pyfile=avro.test.mock_tether_parent.__file__
       proc=subprocess.Popen(["python", pyfile,"start_server","{0}".format(parent_port)])
       input_port = avro.tether.util.find_port()
 
@@ -59,7 +58,7 @@ class TestTetherTaskRunner(unittest.TestCase):
       # so we give the subprocess time to start up
       time.sleep(1)
 
-      runner = avro.tether.tether_task_runner.TaskRunner(WordCountTask())
+      runner = avro.tether.tether_task_runner.TaskRunner(avro.test.word_count_task.WordCountTask())
 
       runner.start(outputport=parent_port,join=False)
 
@@ -154,7 +153,7 @@ class TestTetherTaskRunner(unittest.TestCase):
       env["PYTHONPATH"]=':'.join(sys.path)
       parent_port = avro.tether.util.find_port()
 
-      pyfile=mock_tether_parent.__file__
+      pyfile=avro.test.mock_tether_parent.__file__
       proc=subprocess.Popen(["python", pyfile,"start_server","{0}".format(parent_port)])
 
       #Possible race condition? when we start tether_task_runner it will call
@@ -167,7 +166,7 @@ class TestTetherTaskRunner(unittest.TestCase):
       env={"AVRO_TETHER_OUTPUT_PORT":"{0}".format(parent_port)}
       env["PYTHONPATH"]=':'.join(sys.path)
 
-      runnerproc = subprocess.Popen(["python", avro.tether.tether_task_runner.__file__, "word_count_task.WordCountTask"],env=env)
+      runnerproc = subprocess.Popen(["python", avro.tether.tether_task_runner.__file__, "avro.test.word_count_task.WordCountTask"], env=env)
 
       #possible race condition wait for the process to start
       time.sleep(1)
diff --git a/lang/py/test/test_tether_word_count.py b/lang/py/avro/test/test_tether_word_count.py
similarity index 78%
rename from lang/py/test/test_tether_word_count.py
rename to lang/py/avro/test/test_tether_word_count.py
index d368d6f..acd0b83 100644
--- a/lang/py/test/test_tether_word_count.py
+++ b/lang/py/avro/test/test_tether_word_count.py
@@ -20,7 +20,9 @@
 from __future__ import absolute_import, division, print_function
 
 import collections
+import distutils.spawn
 import os
+import platform
 import shutil
 import subprocess
 import sys
@@ -32,11 +34,17 @@ import avro.datafile
 import avro.io
 import avro.schema
 import avro.tether.tether_task_runner
-import set_avro_test_path
 
-_TOP_DIR = """@TOPDIR@"""
-_AVRO_VERSION = """@AVRO_VERSION@"""
-_JAR_PATH = os.path.abspath(os.path.join(_TOP_DIR, "..", "java", "tools", "target", "avro-tools-{}.jar".format(_AVRO_VERSION)))
+_AVRO_DIR = os.path.abspath(os.path.dirname(avro.__file__))
+
+def _version():
+  with open(os.path.join(_AVRO_DIR, 'VERSION.txt')) as v:
+    # Convert it back to the java version
+    return v.read().strip().replace('+', '-')
+
+_AVRO_VERSION = _version()
+_JAR_PATH = os.path.join(os.path.dirname(os.path.dirname(_AVRO_DIR)),
+    "java", "tools", "target", "avro-tools-{}.jar".format(_AVRO_VERSION))
 
 _LINES = ("the quick brown fox jumps over the lazy dog",
           "the cow jumps over the moon",
@@ -56,6 +64,24 @@ _PYTHON_PATH = os.pathsep.join([os.path.dirname(os.path.dirname(avro.__file__)),
                                 os.path.dirname(__file__)])
 
 
+def _has_java():
+  """Detect if this system has a usable java installed.
+
+  On most systems, this is just checking if `java` is in the PATH.
+
+  But macos always has a /usr/bin/java, which does not mean java is installed. If you invoke java on macos and java is not installed, macos will spawn a popup telling you how to install java. This code does additional work around that to be completely automatic.
+  """
+  if platform.system() == "Darwin":
+    try:
+      output = subprocess.check_output("/usr/libexec/java_home", stderr=subprocess.STDOUT)
+    except subprocess.CalledProcessError as e:
+      output = e.output
+    return ("No Java runtime present" not in output)
+  return bool(distutils.spawn.find_executable("java"))
+
+
+@unittest.skipUnless(_has_java(), "No Java runtime present")
+@unittest.skipUnless(os.path.exists(_JAR_PATH), "{} not found".format(_JAR_PATH))
 class TestTetherWordCount(unittest.TestCase):
   """unittest for a python tethered map-reduce job."""
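
For readers porting the skip logic above to Python 3, a hedged sketch follows. It assumes `shutil.which` as a replacement for `distutils.spawn.find_executable`, and it asks `subprocess` for text output so the substring check compares `str` with `str`. The function name `has_java` and the test class `NeedsJava` are hypothetical.

```python
import platform
import shutil
import subprocess
import unittest


def has_java():
    """Best-effort check for a usable Java runtime."""
    if platform.system() == "Darwin":
        # macOS ships a /usr/bin/java stub, so ask java_home instead.
        try:
            output = subprocess.check_output(
                ["/usr/libexec/java_home"],
                stderr=subprocess.STDOUT,
                universal_newlines=True,  # decode output to str
            )
        except subprocess.CalledProcessError as e:
            output = e.output
        except OSError:
            # java_home itself is missing; treat as "no Java".
            return False
        return "No Java runtime present" not in output
    return shutil.which("java") is not None


@unittest.skipUnless(has_java(), "No Java runtime present")
class NeedsJava(unittest.TestCase):
    def test_runtime_detected(self):
        self.assertTrue(has_java())


if __name__ == "__main__":
    unittest.main()
```

Because the decorator is evaluated at import time, the check runs once per test module, and the whole class is reported as skipped rather than failed on machines without Java.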
 
diff --git a/lang/py/test/txsample_http_client.py b/lang/py/avro/test/txsample_http_client.py
similarity index 100%
rename from lang/py/test/txsample_http_client.py
rename to lang/py/avro/test/txsample_http_client.py
diff --git a/lang/py/test/txsample_http_server.py b/lang/py/avro/test/txsample_http_server.py
similarity index 100%
rename from lang/py/test/txsample_http_server.py
rename to lang/py/avro/test/txsample_http_server.py
diff --git a/lang/py/test/word_count_task.py b/lang/py/avro/test/word_count_task.py
similarity index 100%
rename from lang/py/test/word_count_task.py
rename to lang/py/avro/test/word_count_task.py
diff --git a/lang/py/src/avro/tether/__init__.py b/lang/py/avro/tether/__init__.py
similarity index 100%
rename from lang/py/src/avro/tether/__init__.py
rename to lang/py/avro/tether/__init__.py
diff --git a/lang/py/src/avro/tether/tether_task.py b/lang/py/avro/tether/tether_task.py
similarity index 100%
rename from lang/py/src/avro/tether/tether_task.py
rename to lang/py/avro/tether/tether_task.py
diff --git a/lang/py/src/avro/tether/tether_task_runner.py b/lang/py/avro/tether/tether_task_runner.py
similarity index 100%
rename from lang/py/src/avro/tether/tether_task_runner.py
rename to lang/py/avro/tether/tether_task_runner.py
diff --git a/lang/py/src/avro/tether/util.py b/lang/py/avro/tether/util.py
similarity index 100%
rename from lang/py/src/avro/tether/util.py
rename to lang/py/avro/tether/util.py
diff --git a/lang/py/src/avro/timezones.py b/lang/py/avro/timezones.py
similarity index 100%
rename from lang/py/src/avro/timezones.py
rename to lang/py/avro/timezones.py
diff --git a/lang/py/src/avro/tool.py b/lang/py/avro/tool.py
similarity index 100%
rename from lang/py/src/avro/tool.py
rename to lang/py/avro/tool.py
diff --git a/lang/py/src/avro/txipc.py b/lang/py/avro/txipc.py
similarity index 100%
rename from lang/py/src/avro/txipc.py
rename to lang/py/avro/txipc.py
diff --git a/lang/py/build.sh b/lang/py/build.sh
index 33040d8..2dc6bd0 100755
--- a/lang/py/build.sh
+++ b/lang/py/build.sh
@@ -18,31 +18,53 @@
 set -e
 
 usage() {
-  echo "Usage: $0 {lint|test|dist|clean}"
+  echo "Usage: $0 {clean|dist|interop-data-generate|interop-data-test|lint|test}"
   exit 1
 }
 
+clean() {
+  git clean -xdf '*.avpr' \
+                 '*.avsc' \
+                 '*.egg-info' \
+                 '*.py[co]' \
+                 'VERSION.txt' \
+                 '__pycache__' \
+                 'avro/test/interop' \
+                 'dist' \
+                 'userlogs'
+}
+
+dist() {
+  ./setup.py dist
+}
+
+interop-data-generate() {
+  ./setup.py generate_interop_data
+}
+
+interop-data-test() {
+  python -m unittest avro.test.test_datafile_interop
+}
+
+lint() {
+  ./setup.py isort lint
+}
+
+test_() {
+  ./setup.py test
+}
+
 main() {
-  local target
   (( $# )) || usage
   for target; do
     case "$target" in
-      lint)
-        ./setup.py isort lint
-        ;;
-      test)
-        ant test
-        ;;
-      dist)
-        ant dist
-        ;;
-      clean)
-        ant clean
-        rm -rf userlogs/
-        ;;
-      *)
-        usage
-        ;;
+      clean) clean;;
+      dist) dist;;
+      interop-data-generate) interop-data-generate;;
+      interop-data-test) interop-data-test;;
+      lint) lint;;
+      test) test_;;
+      *) usage;;
     esac
   done
 }
diff --git a/lang/py/ivy.xml b/lang/py/ivy.xml
deleted file mode 100644
index 1926b19..0000000
--- a/lang/py/ivy.xml
+++ /dev/null
@@ -1,24 +0,0 @@
-<!--
-   Licensed to the Apache Software Foundation (ASF) under one or more
-   contributor license agreements.  See the NOTICE file distributed with
-   this work for additional information regarding copyright ownership.
-   The ASF licenses this file to You under the Apache License, Version 2.0
-   (the "License"); you may not use this file except in compliance with
-   the License.  You may obtain a copy of the License at
-
-       https://www.apache.org/licenses/LICENSE-2.0
-
-   Unless required by applicable law or agreed to in writing, software
-   distributed under the License is distributed on an "AS IS" BASIS,
-   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-   See the License for the specific language governing permissions and
-   limitations under the License.
--->
-<ivy-module version="2.0">
-    <info organisation="org.apache.avro" module="python"/>
-    <configurations defaultconfmapping="default"/>
-    <dependencies>
-        <dependency org="org.apache.avro" name="avro-tools"
-                    rev="${avro.version}" transitive="false"/>
-    </dependencies>
-</ivy-module>
diff --git a/lang/py/ivysettings.xml b/lang/py/ivysettings.xml
deleted file mode 100644
index 0258c01..0000000
--- a/lang/py/ivysettings.xml
+++ /dev/null
@@ -1,30 +0,0 @@
-<!--
-   Licensed to the Apache Software Foundation (ASF) under one or more
-   contributor license agreements.  See the NOTICE file distributed with
-   this work for additional information regarding copyright ownership.
-   The ASF licenses this file to You under the Apache License, Version 2.0
-   (the "License"); you may not use this file except in compliance with
-   the License.  You may obtain a copy of the License at
-
-       https://www.apache.org/licenses/LICENSE-2.0
-
-   Unless required by applicable law or agreed to in writing, software
-   distributed under the License is distributed on an "AS IS" BASIS,
-   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-   See the License for the specific language governing permissions and
-   limitations under the License.
--->
-<ivysettings>
-  <settings defaultResolver="repos" />
-  <property name="m2-pattern" value="${user.home}/.m2/repository/[organisation]/[module]/[revision]/[module]-[revision](-[classifier]).[ext]" override="false" />
-  <resolvers>
-    <chain name="repos">
-      <ibiblio name="central" m2compatible="true"/>
-      <ibiblio name="apache-snapshots" m2compatible="true" root="https://repository.apache.org/content/groups/snapshots"/>
-      <filesystem name="local-maven2" m2compatible="true"> <!-- needed when building non-snapshot version for release -->
-        <artifact pattern="${m2-pattern}"/>
-        <ivy pattern="${m2-pattern}"/>
-      </filesystem>
-    </chain>
-  </resolvers>
-</ivysettings>
diff --git a/lang/py/lib/pyAntTasks-1.3-LICENSE.txt b/lang/py/lib/pyAntTasks-1.3-LICENSE.txt
deleted file mode 100644
index 62589ed..0000000
--- a/lang/py/lib/pyAntTasks-1.3-LICENSE.txt
+++ /dev/null
@@ -1,202 +0,0 @@
-
-                                 Apache License
-                           Version 2.0, January 2004
-                        https://www.apache.org/licenses/
-
-   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
-
-   1. Definitions.
-
-      "License" shall mean the terms and conditions for use, reproduction,
-      and distribution as defined by Sections 1 through 9 of this document.
-
-      "Licensor" shall mean the copyright owner or entity authorized by
-      the copyright owner that is granting the License.
-
-      "Legal Entity" shall mean the union of the acting entity and all
-      other entities that control, are controlled by, or are under common
-      control with that entity. For the purposes of this definition,
-      "control" means (i) the power, direct or indirect, to cause the
-      direction or management of such entity, whether by contract or
-      otherwise, or (ii) ownership of fifty percent (50%) or more of the
-      outstanding shares, or (iii) beneficial ownership of such entity.
-
-      "You" (or "Your") shall mean an individual or Legal Entity
-      exercising permissions granted by this License.
-
-      "Source" form shall mean the preferred form for making modifications,
-      including but not limited to software source code, documentation
-      source, and configuration files.
-
-      "Object" form shall mean any form resulting from mechanical
-      transformation or translation of a Source form, including but
-      not limited to compiled object code, generated documentation,
-      and conversions to other media types.
-
-      "Work" shall mean the work of authorship, whether in Source or
-      Object form, made available under the License, as indicated by a
-      copyright notice that is included in or attached to the work
-      (an example is provided in the Appendix below).
-
-      "Derivative Works" shall mean any work, whether in Source or Object
-      form, that is based on (or derived from) the Work and for which the
-      editorial revisions, annotations, elaborations, or other modifications
-      represent, as a whole, an original work of authorship. For the purposes
-      of this License, Derivative Works shall not include works that remain
-      separable from, or merely link (or bind by name) to the interfaces of,
-      the Work and Derivative Works thereof.
-
-      "Contribution" shall mean any work of authorship, including
-      the original version of the Work and any modifications or additions
-      to that Work or Derivative Works thereof, that is intentionally
-      submitted to Licensor for inclusion in the Work by the copyright owner
-      or by an individual or Legal Entity authorized to submit on behalf of
-      the copyright owner. For the purposes of this definition, "submitted"
-      means any form of electronic, verbal, or written communication sent
-      to the Licensor or its representatives, including but not limited to
-      communication on electronic mailing lists, source code control systems,
-      and issue tracking systems that are managed by, or on behalf of, the
-      Licensor for the purpose of discussing and improving the Work, but
-      excluding communication that is conspicuously marked or otherwise
-      designated in writing by the copyright owner as "Not a Contribution."
-
-      "Contributor" shall mean Licensor and any individual or Legal Entity
-      on behalf of whom a Contribution has been received by Licensor and
-      subsequently incorporated within the Work.
-
-   2. Grant of Copyright License. Subject to the terms and conditions of
-      this License, each Contributor hereby grants to You a perpetual,
-      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
-      copyright license to reproduce, prepare Derivative Works of,
-      publicly display, publicly perform, sublicense, and distribute the
-      Work and such Derivative Works in Source or Object form.
-
-   3. Grant of Patent License. Subject to the terms and conditions of
-      this License, each Contributor hereby grants to You a perpetual,
-      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
-      (except as stated in this section) patent license to make, have made,
-      use, offer to sell, sell, import, and otherwise transfer the Work,
-      where such license applies only to those patent claims licensable
-      by such Contributor that are necessarily infringed by their
-      Contribution(s) alone or by combination of their Contribution(s)
-      with the Work to which such Contribution(s) was submitted. If You
-      institute patent litigation against any entity (including a
-      cross-claim or counterclaim in a lawsuit) alleging that the Work
-      or a Contribution incorporated within the Work constitutes direct
-      or contributory patent infringement, then any patent licenses
-      granted to You under this License for that Work shall terminate
-      as of the date such litigation is filed.
-
-   4. Redistribution. You may reproduce and distribute copies of the
-      Work or Derivative Works thereof in any medium, with or without
-      modifications, and in Source or Object form, provided that You
-      meet the following conditions:
-
-      (a) You must give any other recipients of the Work or
-          Derivative Works a copy of this License; and
-
-      (b) You must cause any modified files to carry prominent notices
-          stating that You changed the files; and
-
-      (c) You must retain, in the Source form of any Derivative Works
-          that You distribute, all copyright, patent, trademark, and
-          attribution notices from the Source form of the Work,
-          excluding those notices that do not pertain to any part of
-          the Derivative Works; and
-
-      (d) If the Work includes a "NOTICE" text file as part of its
-          distribution, then any Derivative Works that You distribute must
-          include a readable copy of the attribution notices contained
-          within such NOTICE file, excluding those notices that do not
-          pertain to any part of the Derivative Works, in at least one
-          of the following places: within a NOTICE text file distributed
-          as part of the Derivative Works; within the Source form or
-          documentation, if provided along with the Derivative Works; or,
-          within a display generated by the Derivative Works, if and
-          wherever such third-party notices normally appear. The contents
-          of the NOTICE file are for informational purposes only and
-          do not modify the License. You may add Your own attribution
-          notices within Derivative Works that You distribute, alongside
-          or as an addendum to the NOTICE text from the Work, provided
-          that such additional attribution notices cannot be construed
-          as modifying the License.
-
-      You may add Your own copyright statement to Your modifications and
-      may provide additional or different license terms and conditions
-      for use, reproduction, or distribution of Your modifications, or
-      for any such Derivative Works as a whole, provided Your use,
-      reproduction, and distribution of the Work otherwise complies with
-      the conditions stated in this License.
-
-   5. Submission of Contributions. Unless You explicitly state otherwise,
-      any Contribution intentionally submitted for inclusion in the Work
-      by You to the Licensor shall be under the terms and conditions of
-      this License, without any additional terms or conditions.
-      Notwithstanding the above, nothing herein shall supersede or modify
-      the terms of any separate license agreement you may have executed
-      with Licensor regarding such Contributions.
-
-   6. Trademarks. This License does not grant permission to use the trade
-      names, trademarks, service marks, or product names of the Licensor,
-      except as required for reasonable and customary use in describing the
-      origin of the Work and reproducing the content of the NOTICE file.
-
-   7. Disclaimer of Warranty. Unless required by applicable law or
-      agreed to in writing, Licensor provides the Work (and each
-      Contributor provides its Contributions) on an "AS IS" BASIS,
-      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
-      implied, including, without limitation, any warranties or conditions
-      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
-      PARTICULAR PURPOSE. You are solely responsible for determining the
-      appropriateness of using or redistributing the Work and assume any
-      risks associated with Your exercise of permissions under this License.
-
-   8. Limitation of Liability. In no event and under no legal theory,
-      whether in tort (including negligence), contract, or otherwise,
-      unless required by applicable law (such as deliberate and grossly
-      negligent acts) or agreed to in writing, shall any Contributor be
-      liable to You for damages, including any direct, indirect, special,
-      incidental, or consequential damages of any character arising as a
-      result of this License or out of the use or inability to use the
-      Work (including but not limited to damages for loss of goodwill,
-      work stoppage, computer failure or malfunction, or any and all
-      other commercial damages or losses), even if such Contributor
-      has been advised of the possibility of such damages.
-
-   9. Accepting Warranty or Additional Liability. While redistributing
-      the Work or Derivative Works thereof, You may choose to offer,
-      and charge a fee for, acceptance of support, warranty, indemnity,
-      or other liability obligations and/or rights consistent with this
-      License. However, in accepting such obligations, You may act only
-      on Your own behalf and on Your sole responsibility, not on behalf
-      of any other Contributor, and only if You agree to indemnify,
-      defend, and hold each Contributor harmless for any liability
-      incurred by, or claims asserted against, such Contributor by reason
-      of your accepting any such warranty or additional liability.
-
-   END OF TERMS AND CONDITIONS
-
-   APPENDIX: How to apply the Apache License to your work.
-
-      To apply the Apache License to your work, attach the following
-      boilerplate notice, with the fields enclosed by brackets "[]"
-      replaced with your own identifying information. (Don't include
-      the brackets!)  The text should be enclosed in the appropriate
-      comment syntax for the file format. We also recommend that a
-      file or class name and description of purpose be included on the
-      same "printed page" as the copyright notice for easier
-      identification within third-party archives.
-
-   Copyright [yyyy] [name of copyright owner]
-
-   Licensed under the Apache License, Version 2.0 (the "License");
-   you may not use this file except in compliance with the License.
-   You may obtain a copy of the License at
-
-       https://www.apache.org/licenses/LICENSE-2.0
-
-   Unless required by applicable law or agreed to in writing, software
-   distributed under the License is distributed on an "AS IS" BASIS,
-   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-   See the License for the specific language governing permissions and
-   limitations under the License.
diff --git a/lang/py/lib/pyAntTasks-1.3.jar b/lang/py/lib/pyAntTasks-1.3.jar
deleted file mode 100644
index 53a7877..0000000
Binary files a/lang/py/lib/pyAntTasks-1.3.jar and /dev/null differ
diff --git a/lang/py/scripts/avro b/lang/py/scripts/avro
old mode 100644
new mode 100755
index 7488f70..e0c116b
--- a/lang/py/scripts/avro
+++ b/lang/py/scripts/avro
@@ -23,15 +23,25 @@ from __future__ import absolute_import, division, print_function
 import csv
 import json
 from functools import partial
+import os.path
 from itertools import ifilter, imap
-from os.path import splitext
 from sys import stdin, stdout
 
+import avro
 import avro.schema
 from avro.datafile import DataFileReader, DataFileWriter
 from avro.io import DatumReader, DatumWriter
 
 
+_AVRO_DIR = os.path.abspath(os.path.dirname(avro.__file__))
+
+def _version():
+  with open(os.path.join(_AVRO_DIR, 'VERSION.txt')) as v:
+    return v.read()
+
+_AVRO_VERSION = _version()
+
+
 class AvroError(Exception):
     pass
 
@@ -118,7 +128,7 @@ def cat(opts, args):
     for filename in args:
         try:
             fo = open(filename, "rb")
-        except (OSError, IOError), e:
+        except (OSError, IOError) as e:
             raise AvroError("Can't open %s - %s" % (filename, e))
 
         avro = DataFileReader(fo, DatumReader())
@@ -175,7 +185,7 @@ def guess_input_type(files):
     if not files:
         return None
 
-    ext = splitext(files[0])[1].lower()
+    ext = os.path.splitext(files[0])[1].lower()
     if ext in (".json", ".js"):
         return "json"
     elif ext in (".csv",):
@@ -194,7 +204,7 @@ def write(opts, files):
     try:
         schema = avro.schema.parse(open(opts.schema, "rb").read())
         out = _open(opts.output, "wb")
-    except (IOError, OSError), e:
+    except (IOError, OSError) as e:
         raise AvroError("Can't open file - %s" % e)
 
     writer = DataFileWriter(out, DatumWriter(), schema)
@@ -214,7 +224,7 @@ def main(argv=None):
     argv = argv or sys.argv
 
     parser = OptionParser(description="Display/write for Avro files",
-                      version="@AVRO_VERSION@",
+                      version=_AVRO_VERSION,
                       usage="usage: %prog cat|write [options] FILE [FILE...]")
     # cat options
 
@@ -257,9 +267,9 @@ def main(argv=None):
             write(opts, args)
         else:
             raise AvroError("Unknown command - %s" % command)
-    except AvroError, e:
+    except AvroError as e:
         parser.error("%s" % e) # Will exit
-    except Exception, e:
+    except Exception as e:
         raise SystemExit("panic: %s" % e)
 
 if __name__ == "__main__":
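
Note on the exception-syntax hunks above: "except X, e" is accepted only by
Python 2, while "except X as e" is accepted by Python 2.6+ and Python 3. A
minimal sketch of the new form follows. AvroError here is a stand-in for the
class in scripts/avro, and open_or_raise is a hypothetical helper that is not
part of the commit.

```python
class AvroError(Exception):
    """Stand-in for the AvroError class defined in scripts/avro."""


def open_or_raise(filename):
    """Open filename for binary reading, or raise AvroError.

    Hypothetical helper for illustration; the commit inlines this logic in cat().
    """
    try:
        return open(filename, "rb")
    except (OSError, IOError) as e:  # "as" binds the exception on 2.6+ and 3.x
        raise AvroError("Can't open %s - %s" % (filename, e))
```

The parenthesized tuple still catches either exception type. Only the binding
syntax changes.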
diff --git a/lang/py/setup.cfg b/lang/py/setup.cfg
index 2b53378..c1f9fc0 100644
--- a/lang/py/setup.cfg
+++ b/lang/py/setup.cfg
@@ -14,6 +14,54 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 
+##
+# https://setuptools.readthedocs.io/en/latest/setuptools.html#configuring-setup-using-setup-cfg-files
+[metadata]
+name = avro
+version = file: avro/VERSION.txt
+description = Avro is a serialization and RPC framework.
+long_description = file: README.txt
+keywords =
+    avro
+    serialization
+    rpc
+author = Apache Avro
+author_email = dev@avro.apache.org
+url = https://avro.apache.org/
+license = Apache License 2.0
+classifiers =
+    License :: OSI Approved :: Apache Software License
+    Programming Language :: Python :: 2.7 :: Only
+
+[options]
+packages = avro
+package_dir =
+    avro = avro
+include_package_data = true
+setup_requires =
+  isort
+  pycodestyle
+install_requires =
+zip_safe = true
+scripts =
+    scripts/avro
+python_requires = <3.0,>=2.7
+
+[options.package_data]
+avro =
+    HandshakeRequest.avsc
+    HandshakeResponse.avsc
+    share/VERSION.txt
+    LICENSE
+    NOTICE
+
+[options.extras_require]
+snappy = python-snappy
+zstandard = zstandard
+
+[aliases]
+dist = sdist --dist-dir ../../dist/py
+
 [isort]
 line_length = 150
 known_third_party=zope
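
Note on the setup.cfg hunk above: the metadata that used to be passed as
setuptools.setup() arguments is now declared in INI form. A rough sketch of
how such a file can be inspected with the standard library is below. It
assumes Python 3's configparser (the package itself targets Python 2.7), uses
a shortened copy of the new sections, and read_extras is a hypothetical name.
setuptools does its own parsing, including resolving the "file:" directive,
which this sketch does not attempt.

```python
import configparser

# Shortened copy of sections added by the commit (illustration only).
SETUP_CFG = """\
[metadata]
name = avro
version = file: avro/VERSION.txt

[options.extras_require]
snappy = python-snappy
zstandard = zstandard
"""


def read_extras(cfg_text):
    """Return the [options.extras_require] section as a plain dict."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    return dict(parser["options.extras_require"])


def read_raw_version(cfg_text):
    """Return the raw, unresolved value of metadata.version."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    return parser["metadata"]["version"]
```

Here read_raw_version returns the literal directive string, not the contents
of avro/VERSION.txt. Reading that file is left to setuptools at build time.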
diff --git a/lang/py/setup.py b/lang/py/setup.py
index b978092..fc662a3 100755
--- a/lang/py/setup.py
+++ b/lang/py/setup.py
@@ -27,11 +27,94 @@ import subprocess
 
 import setuptools
 
+_HERE = os.path.dirname(os.path.abspath(__file__))
+_AVRO_DIR = os.path.join(_HERE, 'avro')
+_VERSION_FILE_NAME = 'VERSION.txt'
+
+
+def _is_distribution():
+    """Tests whether setup.py is invoked from a distribution.
+
+    Returns:
+        True if setup.py runs from a distribution.
+        False otherwise, ie. if setup.py runs from a version control work tree.
+    """
+    # If a file PKG-INFO exists as a sibling of setup.py,
+    # assume we are running as source distribution:
+    return os.path.exists(os.path.join(_HERE, 'PKG-INFO'))
+
+
+def _generate_package_data():
+    """Generate package data.
+
+    This data will already exist in a distribution package,
+    so this function only runs for local version control work tree.
+    """
+    distutils.log.info('Generating package data')
+
+    # Avro top-level source directory:
+    root_dir = os.path.dirname(os.path.dirname(_HERE))
+    share_dir = os.path.join(root_dir, 'share')
+
+    # Create a PEP440 compliant version file.
+    version_file_path = os.path.join(share_dir, _VERSION_FILE_NAME)
+    with open(version_file_path, 'rb') as vin:
+        version = vin.read().replace('-', '+')
+    with open(os.path.join(_AVRO_DIR, _VERSION_FILE_NAME), 'wb') as vout:
+        vout.write(version)
+
+    avro_schemas_dir = os.path.join(share_dir, 'schemas', 'org', 'apache', 'avro')
+    ipc_dir = os.path.join(avro_schemas_dir, 'ipc')
+    tether_dir = os.path.join(avro_schemas_dir, 'mapred', 'tether')
+
+    # Copy necessary avsc files:
+    avsc_files = (
+        ((share_dir, 'test', 'schemas', 'interop.avsc'), ('',)),
+        ((ipc_dir, 'HandshakeRequest.avsc'), ('',)),
+        ((ipc_dir, 'HandshakeResponse.avsc'), ('',)),
+        ((tether_dir, 'InputProtocol.avpr'), ('tether',)),
+        ((tether_dir, 'OutputProtocol.avpr'), ('tether',)),
+    )
+
+    for src, dst in avsc_files:
+        src = os.path.join(*src)
+        dst = os.path.join(_AVRO_DIR, *dst)
+        distutils.file_util.copy_file(src, dst)
+
+
+class GenerateInteropDataCommand(setuptools.Command):
+    """A command to generate Avro files for data interop test."""
+
+    user_options = [
+      ('schema-file=', None, 'path to input Avro schema file'),
+      ('output-path=', None, 'path to output Avro data files'),
+    ]
+
+    def initialize_options(self):
+      self.schema_file = os.path.join(_AVRO_DIR, 'interop.avsc')
+      self.output_path = os.path.join(_AVRO_DIR, 'test', 'interop', 'data')
+
+    def finalize_options(self):
+        pass
+
+    def run(self):
+      # Late import -- this can only be run when avro is on the pythonpath,
+      # more or less after install.
+      import avro.test.gen_interop_data
+      if not os.path.exists(self.output_path):
+        os.makedirs(self.output_path)
+      avro.test.gen_interop_data.generate(self.schema_file,
+                                          os.path.join(self.output_path, 'py.avro'))
+
 
 def _get_version():
   curdir = os.getcwd()
-  version_file = ("VERSION.txt" if os.path.isfile("VERSION.txt")
-    else os.path.join(curdir[:curdir.index("lang/py")], "share/VERSION.txt"))
+  if os.path.isfile("share/VERSION.txt"):
+    version_file = "share/VERSION.txt"
+  else:
+    index = curdir.index("lang/py")
+    path = curdir[:index]
+    version_file = os.path.join(path, "share/VERSION.txt")
   with open(version_file) as verfile:
     # To follow the naming convention defined by PEP 440
     # in the case that the version is like "x.y.z-SNAPSHOT"
@@ -41,7 +124,7 @@ def _get_version():
 class LintCommand(setuptools.Command):
     """Run pycodestyle on all your modules"""
     description = __doc__
-    user_options = []
+    user_options = []  # type: ignore
 
     def initialize_options(self):
         pass
@@ -60,33 +143,15 @@ class LintCommand(setuptools.Command):
         if p.wait():
             raise distutils.errors.DistutilsError("pycodestyle exited with a nonzero exit code.")
 
+def main():
+    if not _is_distribution():
+        _generate_package_data()
+
+    setuptools.setup(cmdclass={
+        "generate_interop_data": GenerateInteropDataCommand,
+        "lint": LintCommand,
+    })
+
 
-setuptools.setup(
-  name = 'avro',
-  version = _get_version(),
-  packages = ['avro'],
-  package_dir = {'': 'src'},
-  scripts = ["./scripts/avro"],
-  setup_requires = [
-    'isort',
-    'pycodestyle',
-  ],
-  cmdclass={
-      "lint": LintCommand,
-  },
-
-  #include_package_data=True,
-  package_data={'avro': ['LICENSE', 'NOTICE']},
-
-  # metadata for upload to PyPI
-  author = 'Apache Avro',
-  author_email = 'dev@avro.apache.org',
-  description = 'Avro is a serialization and RPC framework.',
-  license = 'Apache License 2.0',
-  keywords = 'avro serialization rpc',
-  url = 'https://avro.apache.org/',
-  extras_require = {
-    'snappy': ['python-snappy'],
-    'zstandard': ['zstandard'],
-  },
-)
+if __name__ == '__main__':
+    main()
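
Note on _generate_package_data() above: the version string from
share/VERSION.txt is rewritten so that a suffix such as "-SNAPSHOT" becomes a
PEP 440 local version label ("+SNAPSHOT"). A minimal sketch of that
transformation follows. to_pep440 is a hypothetical name, and the strip() call
is an addition for illustration that the commit does not perform.

```python
def to_pep440(raw_version):
    """Turn a version like 'x.y.z-SNAPSHOT' into 'x.y.z+SNAPSHOT'.

    Mirrors the replace('-', '+') call in _generate_package_data().
    The strip() is an assumption added here to drop a trailing newline.
    """
    return raw_version.strip().replace('-', '+')
```

A release version without a hyphen passes through unchanged, so the same code
path can serve both snapshot and release builds.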
diff --git a/lang/py/test/set_avro_test_path.py b/lang/py/test/set_avro_test_path.py
deleted file mode 100644
index 29b666c..0000000
--- a/lang/py/test/set_avro_test_path.py
+++ /dev/null
@@ -1,44 +0,0 @@
-#!/usr/bin/env python
-
-##
-# Licensed to the Apache Software Foundation (ASF) under one
-# or more contributor license agreements.  See the NOTICE file
-# distributed with this work for additional information
-# regarding copyright ownership.  The ASF licenses this file
-# to you under the Apache License, Version 2.0 (the
-# "License"); you may not use this file except in compliance
-# with the License.  You may obtain a copy of the License at
-#
-# https://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-
-"""
-Module adjusts the path PYTHONPATH so the unittests
-will work even if an egg for AVRO is already installed.
-By default eggs always appear higher on pythons path then
-directories set via the environment variable PYTHONPATH.
-
-For reference see:
-https://www.velocityreviews.com/forums/t716589-pythonpath-and-eggs.html
-https://stackoverflow.com/questions/897792/pythons-sys-path-value.
-
-Unittests would therefore use the installed AVRO and not the AVRO
-being built. To work around this the unittests import this module before
-importing AVRO. This module in turn adjusts the python path so that the test
-build of AVRO is higher on the path then any installed eggs.
-"""
-
-from __future__ import absolute_import, division, print_function
-
-import os.path
-import sys
-
-# Make sure all paths that start with the
-# build directory are at the top of the path
-builddir = os.path.dirname(os.path.dirname(__file__))
-sys.path[:0] = [p for p in sys.path if p.startswith(builddir)]
diff --git a/share/test/interop/bin/test_rpc_interop.sh b/share/test/interop/bin/test_rpc_interop.sh
index 6f65b1d..283aff1 100755
--- a/share/test/interop/bin/test_rpc_interop.sh
+++ b/share/test/interop/bin/test_rpc_interop.sh
@@ -24,8 +24,8 @@ VERSION=`cat share/VERSION.txt`
 java_client="java -jar lang/java/tools/target/avro-tools-$VERSION.jar rpcsend"
 java_server="java -jar lang/java/tools/target/avro-tools-$VERSION.jar rpcreceive"
 
-py_client="env PYTHONPATH=lang/py/build/src python -m avro.tool rpcsend"
-py_server="env PYTHONPATH=lang/py/build/src python -m avro.tool rpcreceive"
+py_client="env PYTHONPATH=lang/py python -m avro.tool rpcsend"
+py_server="env PYTHONPATH=lang/py python -m avro.tool rpcreceive"
 
 py3_client="env PYTHONPATH=lang/py3 python3 -m avro.tool rpcsend"
 py3_server="env PYTHONPATH=lang/py3 python3 -m avro.tool rpcreceive"
@@ -40,7 +40,7 @@ proto=share/test/schemas/simple.avpr
 
 portfile=/tmp/interop_$$
 
-function cleanup() {
+cleanup() {
   rm -rf "$portfile"
   for job in `jobs -p` ; do
     kill $(jobs -p) 2>/dev/null || true;

