  1. A starting point for learning how to implement …

    There is a great tutorial on how to run Python with Hadoop Streaming on Amazon EMR. It immediately shows a simple but practical application. To learn Hadoop itself I would …

  2. hadoop - Pyspark: get list of files/directories on HDFS path - Stack ...

    Oct 30, 2020 · As per title. I'm aware of textFile but, as the name suggests, it works only on text files. I would need to access files/directories inside a path on either HDFS or a local path. I'm …

  3. hadoop - How to Access Hive via Python? - Stack Overflow

    Jan 27, 2014 · @python-starter, your method only works when Hive resides on the same server where Python is installed. If you are accessing the Hive tables on a remote server, I …

  4. Get list of files from hdfs (hadoop) directory using python script

    Asked 9 years, 11 months ago · Modified 3 years, 2 months ago · Viewed 21k times

  5. How can I include a python package with Hadoop streaming job?

    Jul 25, 2011 · I am trying to include a Python package (NLTK) with a Hadoop streaming job, but am not sure how to do this without including every file manually via the CLI argument, "-file". Edit: …

  6. How to save a file in hadoop with python - Stack Overflow

    Oct 28, 2014 · Note - If you created this file from a python script called in Hadoop, the intermediate csv file may be stored on some random nodes. Since this file is (presumably) no …

  7. python 3.x - What's the best module for interacting with HDFS …

    Oct 27, 2016 · Python (2 and 3) bindings for the WebHDFS (and HttpFS) API, supporting both secure and insecure clusters. Command line interface to transfer files and start an interactive …

  8. How to connect in Python 3.7 to a hadoop/hive database (that …

    Jan 25, 2019 · Asked 6 years, 10 months ago · Modified 4 …

  9. How to execute a python file (.py) on hadoop distributed file …

    Oct 25, 2017 · Hadoop offers the possibility to run Java applications directly on the cluster using hadoop jar <jar>. Now I have a Python script instead of a Java application. Below is the …

  10. python - winutls for Windows 11 64bit - Stack Overflow

    Nov 14, 2023 · Tags: python, apache-spark, github, hadoop, windows-11 · Asked Nov 14, 2023 at 20:01 by Sarvesh Pandey