troubleshoot a neural network

  • Status: Closed
  • Prize: $30
  • Entries Received: 2
  • Winner: kkapoor97

Contest Brief

I wrote a neural network using Python and Tensorflow. The code and sample data are here: https://drive.google.com/open?id=1vXRIEunxgunqLGRJHEW_w29FXM7Ejjqy
Unfortunately the results do not converge, even on trivial data that the network should, mathematically, be able to overfit to near-zero error.
This is because there is a bug in the code, somewhere.
The first person to identify the complete BUG FIX that makes the network perform correctly (converge toward zero error) wins the prize.

*** THE PRIZE IS NOW INCREASED TO $20. ***
*** DO NOT WAIT UNTIL THE LAST MINUTE TO SUBMIT YOUR SOLUTION: ***
*** THE CONTEST WILL END EARLY WHEN THE FIRST CORRECT SOLUTION IS SUBMITTED. ***

### HELLO ALL CONTESTANTS ###
I am starting to understand what is wrong with this code, but I don't know the TensorFlow/Python syntax to fix it:
The onehot() function converts each character to a 74x1 one-hot vector and then flattens all three into a single 222x1 vector (74 + 74 + 74 = 222).
The problem is that the softmax function is applied to all 222 logits simultaneously. This is logically incorrect: softmax should be applied separately to each 74x1 group, because the outputs of a softmax must sum to 1.0, and here it is each set of 74 logits, one set per character, that forms a probability distribution. Taking the softmax of all 222 values together is nonsense, because those values do not sum to a total probability of 1.0 as a group; the probability mass of 1.0 belongs to each of the three 74-logit sets independently. A sketch of the fix follows.
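
A minimal sketch of the partitioned softmax, assuming TF 1.x-style code in which the final layer produces a 222-wide logits tensor; the tensor names here are placeholders, not taken from the contest code:

    import tensorflow as tf

    # Placeholder for illustration: the network's raw output, shape (batch, 222),
    # i.e. three groups of 74 logits, one group per output character.
    logits = tf.placeholder(tf.float32, [None, 222])

    # Give each 74-logit group its own axis.
    logits_per_char = tf.reshape(logits, [-1, 3, 74])  # (batch, 3, 74)

    # Softmax over the last axis normalizes each 74-way group independently,
    # so each group's probabilities sum to 1.0 on their own.
    probs = tf.nn.softmax(logits_per_char, axis=-1)    # (batch, 3, 74)

    # Flatten back to (batch, 222) if the rest of the code expects that shape.
    probs_flat = tf.reshape(probs, [-1, 222])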

The use of softmax for this task is described in this paper: https://link.springer.com/chapter/10.1007/978-3-319-67077-5_42

THE PRIZE IS INCREASED TO $30.

** ATTENTION ALL CONTESTANTS ** The code has a typo: the variable training_x should be train_inputs. But this is not the only bug.

** VERY IMPORTANT ** The one-hot encoding of the output is a critical requirement of this problem. You cannot solve it by removing the OHE and the softmax cross-entropy; that is not an acceptable solution. The only acceptable solution is to partition the softmax computation for each independent variable. I don't know how to do that. (A sketch of one way to do it follows.)
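
The same partitioning applies to the training loss. A minimal sketch, again assuming TF 1.x-style code with (batch, 222) logits and flattened one-hot labels (the tensor names are placeholders):

    import tensorflow as tf

    # Placeholders for illustration: network output and flattened one-hot targets.
    logits = tf.placeholder(tf.float32, [None, 222])
    labels = tf.placeholder(tf.float32, [None, 222])

    # Reshape both so each 74-way character group sits on its own axis.
    logits_3x74 = tf.reshape(logits, [-1, 3, 74])
    labels_3x74 = tf.reshape(labels, [-1, 3, 74])

    # softmax_cross_entropy_with_logits_v2 applies softmax over the last
    # axis, giving each 74-way group its own independent loss term.
    per_char_loss = tf.nn.softmax_cross_entropy_with_logits_v2(
        labels=labels_3x74, logits=logits_3x74)        # (batch, 3)

    # Sum the three per-character losses, then average over the batch.
    loss = tf.reduce_mean(tf.reduce_sum(per_char_loss, axis=1))

Minimizing a loss of this form drives each 74-way group toward its own correct character, which is what "converge towards zero error" requires here.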

Employer Feedback

“Good results in the end after several rounds of testing.”

afterhourstech, United States.


Public Clarification Board

  • Vijayabhaskar96
    • 5 years ago

    Why was my entry rejected? You didn't even ask me what I did; before I could explain, you assumed something and rejected my entry.
  • afterhourstech
    Contest Holder
    • 5 years ago

    **Attention all contestants** I think this would be considered a "multi-tasking" network, since there are multiple sets of mutually exclusive classes. Softmax and cross-entropy must be applied to each set. Currently all class sets are flattened into a single array. But maybe this is the bug; I'm not sure. (A split-based sketch of this idea appears after this board.)
  • afterhourstech
    Contest Holder
    • 5 years ago

    PRIZE IS INCREASED TO $20
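
The contest holder's "multi-tasking" comment above suggests an equivalent way to partition the computation: split the flat tensors into the three class sets instead of reshaping them. A sketch under the same assumptions (TF 1.x, (batch, 222) logits and flattened one-hot labels; all names are placeholders):

    import tensorflow as tf

    # Placeholders for illustration: network output and flattened one-hot targets.
    logits = tf.placeholder(tf.float32, [None, 222])
    labels = tf.placeholder(tf.float32, [None, 222])

    # Split the flat 222-wide tensors into three 74-wide class sets and
    # compute one softmax cross-entropy term per set.
    logit_groups = tf.split(logits, num_or_size_splits=3, axis=1)  # 3 tensors of (batch, 74)
    label_groups = tf.split(labels, num_or_size_splits=3, axis=1)

    losses = [
        tf.nn.softmax_cross_entropy_with_logits_v2(labels=lab, logits=log)
        for lab, log in zip(label_groups, logit_groups)
    ]

    # Sum over the three sets, then average over the batch.
    loss = tf.reduce_mean(tf.add_n(losses))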
