Having written my bachelor's thesis on how negation in sentences affects their sentiment: it is really, really difficult. Even just differentiating between negative/neutral/positive sentiment succeeds only about 65% of the time (depending on the source material). Text-based irony/sarcasm detection is still an unsolved problem (often even for humans, as it is strongly context-dependent, not to mention missing indicators such as tone of voice and body language). Basically, you are way better off listening to your own intuition than using a computer to flip a coin.
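To illustrate why negation trips classifiers up (this is just a toy sketch with a made-up lexicon, not the approach from the thesis): a naive word-counting classifier reads "not good" as positive, and even the crude fix of flipping the word after a negator only covers the simplest cases.

```python
# Toy illustration: a naive lexicon-based classifier vs. a crude negation fix.
# Lexicon and function names are made up for this example.

POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "awful", "hate"}
NEGATORS = {"not", "never", "no"}

def naive_sentiment(text: str) -> str:
    """Counts positive/negative words, ignoring negation entirely."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

def negation_aware_sentiment(text: str) -> str:
    """Flips the polarity of a word that directly follows a negator."""
    words = text.lower().split()
    score = 0
    for i, w in enumerate(words):
        negated = i > 0 and words[i - 1] in NEGATORS
        if w in POSITIVE:
            score += -1 if negated else 1
        elif w in NEGATIVE:
            score += 1 if negated else -1
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(naive_sentiment("this is not good"))           # -> positive (wrong)
print(negation_aware_sentiment("this is not good"))  # -> negative
```

And this still falls apart for anything less literal ("yeah, that went *great*"), which is where sarcasm detection comes in.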
Short answer: yes, crowdsourcing would work better.
Long answer: It's difficult to determine how good/bad people actually are at detecting the correct sentiment, as data sets containing phrase/sentence <-> sentiment pairs are often created by majority decision of human taggers. E.g. 7 people are given the same training examples and whatever most of them choose is then used as "correct" answer (gold standard). This might not be the real correct answer though. However, even if we accept this gold standard to actually be the absolute truth, most humans only have a correct detection rate of about ~80% (this is a very rough number, as it depends strongly on the source material, e.g. Tweets, product reviews, etc.). Still, this is way better than computers perform at the moment.
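In code, the gold-standard construction is essentially this (a minimal sketch with made-up labels, assuming every item gets the same number of taggers): take the majority label per item, then measure how often each individual tagger agrees with it. The ~80% figure above is this kind of per-tagger agreement, averaged over many taggers.

```python
from collections import Counter

# Toy data: 3 items, each labeled independently by 7 taggers.
annotations = {
    "item1": ["positive", "positive", "negative", "positive", "neutral", "positive", "positive"],
    "item2": ["negative", "negative", "neutral", "negative", "negative", "positive", "negative"],
    "item3": ["neutral", "positive", "neutral", "neutral", "negative", "neutral", "neutral"],
}

# Majority vote per item -> gold standard label.
gold = {item: Counter(labels).most_common(1)[0][0] for item, labels in annotations.items()}

# Each tagger's agreement rate with the gold standard.
num_taggers = len(next(iter(annotations.values())))
for t in range(num_taggers):
    hits = sum(annotations[item][t] == gold[item] for item in annotations)
    print(f"tagger {t}: {hits / len(annotations):.0%} agreement with gold")
```

Note that a tagger who was outvoted on some item automatically scores below 100% here, even if their reading of the text was defensible, which is exactly the "is the gold standard really the truth" caveat.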
Then again, I assume those texts are written by humans for humans. So isn't the "correct" sentiment exactly what humans tend to make of it? And if humans aren't very good at detecting the sentiment, maybe the writer is at fault, not the readers.
I think letting a number of people read the text and taking the majority vote as the text's sentiment might actually not be a bad way of determining it.
It might be correct to say that a group of humans is interpreting the sentiment "incorrectly" if they don't have all the relevant context/information.