This repository was archived by the owner on Jun 14, 2024. It is now read-only.

Examples/benchmarking #143

Open
tdiethe wants to merge 27 commits into amzn:develop from tdiethe:examples/benchmarking

Conversation

@tdiethe
Contributor

@tdiethe tdiethe commented Dec 20, 2018

Description of changes:

This is some benchmarking of Bayesian Neural Networks (mean-field VI) against a non-Bayesian NN. Hopefully this provides a useful starting point for further analysis (e.g. different kinds of BNN).

The script examples/benchmarking/bnn_classification_benchmark.py runs through several datasets (MNIST, FashionMNIST, CIFAR10, CIFAR100) with three different NN architectures. Several metrics are computed: accuracy, MSE (= Brier score), and log loss. Some "sensible" defaults are set for the hyperparameters; no HP tuning is performed. Results are stored in the results.txt file as a list of JSON strings.
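The three metrics mentioned above can be sketched as follows. This is an illustrative NumPy implementation, not the code from the PR; note that Brier-score conventions vary (per-sample sum over classes vs. per-element mean), and this sketch uses the sum-over-classes form averaged over samples:

```python
import numpy as np

def accuracy(probs, labels):
    """Fraction of samples whose argmax prediction matches the label."""
    return float(np.mean(np.argmax(probs, axis=1) == labels))

def brier_score(probs, labels):
    """Mean squared error between predicted probabilities and one-hot labels."""
    onehot = np.eye(probs.shape[1])[labels]
    return float(np.mean(np.sum((probs - onehot) ** 2, axis=1)))

def log_loss(probs, labels, eps=1e-12):
    """Negative mean log-probability assigned to the true class."""
    p = np.clip(probs[np.arange(len(labels)), labels], eps, 1.0)
    return float(-np.mean(np.log(p)))
```

All three take an (n_samples, n_classes) array of predicted probabilities and an integer label array, so they apply equally to the Bayesian and non-Bayesian models being compared.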

Also added a notebook in the notebooks directory for exploring the results. This outputs figures to the directory examples/benchmarking/figs (figures also included).

Changes to MXFusion core files:

  • mxfusion/components/functions/mxfusion_gluon_function.py: made the exception message more helpful
  • mxfusion/inference/batch_loop.py: added a callback for custom status messages
  • mxfusion/inference/grad_based_inference.py: added GradIteratorBasedInference, a version of GradBasedInference that operates on a data loader
  • mxfusion/inference/minibatch_loop.py: fixed a bug that stopped it working on GPUs; added a callback for custom status messages
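The status-message callback added to the loops follows a common pattern. A minimal sketch of that pattern, with illustrative names (this is not MXFusion's actual loop API): the loop accepts an optional callable that formats the per-iteration status line, falling back to a default message when none is given.

```python
def run_loop(n_iters, loss_fn, status_callback=None):
    """Toy gradient loop: compute a loss each iteration and emit a status line.

    status_callback, if given, is called as status_callback(iteration, loss)
    and should return the status string; otherwise a default format is used.
    """
    messages = []
    for i in range(n_iters):
        loss = loss_fn(i)  # stand-in for a gradient step returning the loss
        if status_callback is not None:
            msg = status_callback(i, loss)
        else:
            msg = "Iteration {} loss: {}".format(i + 1, loss)
        messages.append(msg)
    return messages
```

A caller can then customise progress reporting without subclassing the loop, e.g. `run_loop(10, step, status_callback=lambda i, l: "epoch {} elbo {:.3f}".format(i, -l))`.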

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.


:param inference_algorithm: The applied inference algorithm
:type inference_algorithm: InferenceAlgorithm
:param grad_loop: The reference to the main loop of gradient optimization
Contributor

Could you add a comment that this defaults to minibatch?

:param kwargs: The keyword arguments specify the data for inference. The key of each argument is the name of
the corresponding variable in the model definition, and the value is the data in numpy array format.
"""
# data = [kwargs[v] for v in self.observed_variable_names]
Contributor

Can you remove this if you don't need it?

@meissnereric
Contributor

Looks cool Tom. I haven't had a chance to actually go through what the results look like yet, but the changes to the core MXFusion codebase look fine to me.

@tdiethe
Contributor Author

tdiethe commented Feb 14, 2019

@meissnereric can you have a look at the failing tests? Don't think this was happening before.

@meissnereric
Contributor

meissnereric commented Feb 15, 2019

I think this was happening before, I remember seeing it.

The reason is that you're using Python 3.6-only string formatting (f-strings) in places, e.g. f"Context device id {ctx.device_id} outside range of list {ctx_list} or None". This style isn't supported in 3.4/3.5; use the classic "blah".format() style instead. Shouldn't be a big change, thanks Tom!
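The suggested fix is mechanical: each f-string becomes a `str.format` call with the interpolated expressions passed as arguments. A sketch using plain variables in place of the `ctx.device_id` attribute from the quoted message:

```python
# Stand-ins for the values interpolated in the original f-string.
device_id, ctx_list = 0, []

# Python 3.6+ only (SyntaxError on 3.4/3.5):
# msg = f"Context device id {device_id} outside range of list {ctx_list} or None"

# Equivalent str.format() call, which works on Python 3.4/3.5 as well:
msg = "Context device id {} outside range of list {} or None".format(device_id, ctx_list)
```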

@codecov-io

Codecov Report

Merging #143 into develop will decrease coverage by 0.41%.
The diff coverage is 30.43%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #143      +/-   ##
===========================================
- Coverage    85.19%   84.78%   -0.42%     
===========================================
  Files           78       78              
  Lines         3850     3917      +67     
  Branches       654      666      +12     
===========================================
+ Hits          3280     3321      +41     
- Misses         376      395      +19     
- Partials       194      201       +7
Impacted Files Coverage Δ
...on/components/functions/mxfusion_gluon_function.py 86.9% <0%> (ø) ⬆️
mxfusion/inference/batch_loop.py 80% <0%> (-20%) ⬇️
mxfusion/inference/__init__.py 100% <100%> (ø) ⬆️
mxfusion/inference/grad_based_inference.py 71.42% <33.33%> (-19.88%) ⬇️
mxfusion/inference/minibatch_loop.py 73.33% <40%> (-4.45%) ⬇️
mxfusion/inference/inference_parameters.py 84.4% <0%> (-4.49%) ⬇️
mxfusion/models/factor_graph.py 84.72% <0%> (-0.17%) ⬇️
mxfusion/util/graph_serialization.py
mxfusion/util/serialization.py 85.71% <0%> (ø)
mxfusion/inference/inference.py 83.33% <0%> (+1.51%) ⬆️

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 96ccde7...01fd48c. Read the comment docs.

3 participants