I will address Java vs Python first and then MR vs Hive / Pig separately, since I see them as two different issues.
Hadoop is built around Java: many of its capabilities are available via the Java API, and Hadoop is mostly extended using Java classes.
Hadoop does have the capability to run MR jobs written in other languages - it is called streaming. This model only allows us to define the mapper and reducer, with some restrictions not present in Java. At the same time, input/output formats and other plugins still have to be written as Java classes.
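To make the streaming model concrete, here is a minimal word-count sketch in Python. It assumes the usual streaming contract: the mapper and reducer read lines from standard input and write tab-separated key/value lines to standard output, and the reducer receives its input sorted by key. The function names and the "map" / "reduce" command-line argument are my own convention for illustration, not something Hadoop requires.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one 'word<TAB>1' line for every word in the input."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reduce_lines(lines):
    """Reducer: sum the counts per word.

    Assumes the input is already sorted by key, which is what the
    framework provides between the map and reduce stages.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


if __name__ == "__main__":
    # Hypothetical convention: "wordcount.py map" or "wordcount.py reduce".
    if len(sys.argv) == 2 and sys.argv[1] in ("map", "reduce"):
        stage = map_lines if sys.argv[1] == "map" else reduce_lines
        for out in stage(sys.stdin):
            print(out)
```

A script like this would typically be handed to the streaming jar through its -mapper and -reducer options. Note that everything here is plain text over stdin/stdout; a custom input format would still need a Java class.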
So I would make the decision as follows:
a) Use Java, unless you have a serious codebase in another language that you need to reuse in your MR job.
b) Consider using Python when you need to create some simple ad hoc jobs.
Regarding Pig / Hive - these are also Java-centric systems, at a higher level. Hive can be used without any programming at all, but it can be extended using Java. Pig requires Java from the beginning. I think these systems are almost always preferable to MR jobs in cases where they can be applied. Usually these are cases where the processing is SQL-like.
Performance considerations for streaming vs native Java.
Streaming feeds input to the mapper via its input stream. This is interprocess communication, which is inherently less efficient than the in-process data passing between the record reader and the mapper in the Java case.
I can draw the following conclusions from the above:
a) For light processing (like looking for a substring, counting ...) this overhead can be significant and a Java solution will be more efficient.
b) For heavy processing, which can potentially be implemented more efficiently in some non-Java language, a streaming-based solution can have some edge.
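If you want a rough feel for the overhead of case (a), the following Python sketch runs the same light substring count once in-process and once by piping the records to a child process. This is only an analogy, not a Hadoop benchmark: the child script, the "needle" substring and the function names are assumptions of mine, and the piped timing also includes process startup, which streaming pays once per task rather than once per record.

```python
import subprocess
import sys
import time

# Child program: counts the lines on stdin that contain the substring.
CHILD = "import sys\nprint(sum(1 for line in sys.stdin if 'needle' in line))"


def count_in_process(lines):
    """Light processing done directly on in-memory records."""
    return sum(1 for line in lines if "needle" in line)


def count_via_pipe(lines):
    """Same processing, but the records cross a pipe to another process."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        input="".join(lines),
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


def timed(fn, lines):
    """Return (result, elapsed seconds) for one call of fn(lines)."""
    start = time.perf_counter()
    value = fn(lines)
    return value, time.perf_counter() - start


if __name__ == "__main__":
    data = [f"row {i} {'needle' if i % 10 == 0 else 'hay'}\n"
            for i in range(200_000)]
    for fn in (count_in_process, count_via_pipe):
        value, seconds = timed(fn, data)
        print(f"{fn.__name__}: {value} matches in {seconds:.3f}s")
```

The heavier the per-record work becomes, the smaller the share of time spent moving data through the pipe, which is the reasoning behind case (b).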
Pig / Hive performance considerations.
Pig and Hive both implement the primitives of SQL processing. In other words, they implement the elements of an execution plan in the RDBMS world. These implementations are good and well tuned. At the same time Hive (which I know better) is an interpreter. It does not do code generation - it interprets the execution plan within pre-built MR job(s). This means that if you have some complex conditions and write code specifically for them, that code has every chance of doing much better than Hive - the performance advantage of a compiler over an interpreter.
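To illustrate the interpreter vs hand-written point, here is a toy Python sketch. It is not how Hive is implemented internally; the plan representation, the column names and the functions are all hypothetical. It only shows the shape of the difference: a generic evaluator has to walk a plan tree for every row, while dedicated code tests the condition directly.

```python
import operator

# Operators the toy plan interpreter understands.
OPS = {"and": operator.and_, ">": operator.gt, "==": operator.eq}


def evaluate(node, row):
    """Interpret a predicate tree for one row (the generic, 'Hive-like' path)."""
    kind = node[0]
    if kind == "col":
        return row[node[1]]
    if kind == "lit":
        return node[1]
    left = evaluate(node[1], row)
    right = evaluate(node[2], row)
    return OPS[kind](left, right)


# Toy plan for: amount > 100 AND country == 'DE'
PLAN = ("and",
        (">", ("col", "amount"), ("lit", 100)),
        ("==", ("col", "country"), ("lit", "DE")))


def handwritten(row):
    """The same condition written directly (the 'custom MR code' path)."""
    return row["amount"] > 100 and row["country"] == "DE"


if __name__ == "__main__":
    rows = [
        {"amount": 150, "country": "DE"},
        {"amount": 150, "country": "FR"},
        {"amount": 50, "country": "DE"},
    ]
    for row in rows:
        print(row, evaluate(PLAN, row), handwritten(row))
```

Both paths give the same answer for every row; the interpreted one just pays for the tree walk and the dispatch on each node, and it cannot short-circuit the way the hand-written condition does.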
MapReduce is just a computing framework. HBase has nothing to do with it. That said, you can efficiently put or fetch data to/from HBase by writing MapReduce jobs. Alternatively you can write sequential programs using other HBase APIs, such as Java, to put or fetch the data. But we use Hadoop, HBase etc. to deal with gigantic amounts of data, so that doesn't make much sense. Normal sequential programs would be highly inefficient when your data is that huge.
Coming back to the first part of your question, Hadoop is basically 2 things: a distributed file system (HDFS) + a computation or processing framework (MapReduce). Like any other FS, HDFS provides us storage, but in a fault-tolerant manner with high throughput and lower risk of data loss (because of the replication). But, being a FS, HDFS lacks random read and write access. This is where HBase comes into the picture. It's a distributed, scalable, big data store, modelled after Google's BigTable. It stores data as key/value pairs.
Coming to Hive: it provides us data warehousing facilities on top of an existing Hadoop cluster. Along with that it provides an SQL-like interface, which makes your work easier in case you are coming from an SQL background. You can create tables in Hive and store data there. You can even map your existing HBase tables to Hive and operate on them.
Pig, on the other hand, is basically a dataflow language that allows us to process enormous amounts of data very easily and quickly. Pig basically has 2 parts: the Pig interpreter and the language, PigLatin. You write Pig scripts in PigLatin and process them using the Pig interpreter. Pig makes our life a lot easier, since writing MapReduce is not always easy. In fact in some cases it can really become a pain.
I had written an article on a short comparison of different tools of the Hadoop ecosystem some time ago. It's not an in depth comparison, but a short intro to each of these tools which can help you to get started.
(Just to add on to my answer. No self promotion intended)
Both Hive and Pig queries get converted into MapReduce jobs under the hood.
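As a rough picture of what that conversion means, here is an in-memory Python sketch of how a simple aggregate, something like SELECT dept, SUM(salary) ... GROUP BY dept, breaks down into map, shuffle and reduce steps. The table columns and the function names are made up for illustration, and a real plan produced by Hive or Pig is of course far more involved.

```python
from collections import defaultdict


def map_phase(rows):
    """Map: emit (grouping key, value) for every input row."""
    for row in rows:
        yield row["dept"], row["salary"]


def shuffle(pairs):
    """Shuffle: collect all values that share a key (done by the framework)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: apply the aggregate (here SUM) to each key's values."""
    return {key: sum(values) for key, values in groups.items()}


if __name__ == "__main__":
    rows = [
        {"dept": "eng", "salary": 100},
        {"dept": "ops", "salary": 50},
        {"dept": "eng", "salary": 200},
    ]
    print(reduce_phase(shuffle(map_phase(rows))))
```

The per-key lists built in the shuffle step are roughly what Pig exposes to you as a bag.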
HTH
Complex branching logic with a lot of nested if .. else .. structures is easier and quicker to implement in standard MapReduce. For processing structured data you could use Pangool, which also simplifies things like JOIN. Standard MapReduce also gives you full control to minimize the number of MapReduce jobs that your data processing flow requires, which translates into performance. But it takes more time to code and to introduce changes.
Apache Pig is good for structured data too, but its advantage is the ability to work with BAGs of data (all rows that are grouped on a key), which makes it simpler to implement things like:
Hive is better suited for ad-hoc queries, but its main advantage is that it has an engine that stores and partitions data. Its tables can also be read from Pig or standard MapReduce.
One more thing: Hive and Pig are not well suited to working with hierarchical data.