 # Anatomy of an Implementation

-This section explains a detailed implementation of the LearnAPI.jl for naive [ridge
+This tutorial details an implementation of the LearnAPI.jl for naive [ridge
 regression](https://en.wikipedia.org/wiki/Ridge_regression) with no intercept. The kind of
 workflow we want to enable has been previewed in [Sample workflow](@ref). Readers can also
 refer to the [demonstration](@ref workflow) of the implementation given later.
@@ -35,8 +35,7 @@ A transformer ordinarily implements `transform` instead of `predict`. For more o
 then an implementation must: (i) overload [`obs`](@ref) to articulate how
 provided data can be transformed into a form that does support
 this interface, as illustrated below under
-[Providing a separate data front end](@ref), and which may additionally
-enable certain performance benefits; or (ii) overload the trait
+[Providing a separate data front end](@ref); or (ii) overload the trait
 [`LearnAPI.data_interface`](@ref) to specify a more relaxed data
 API.

@@ -62,7 +61,7 @@ nothing # hide

 Instances of `Ridge` are *[learners](@ref learners)*, in LearnAPI.jl parlance.

-Associated with each new type of LearnAPI.jl [learner](@ref learners) will be a keyword
+Associated with each new type of LearnAPI.jl learner will be a keyword
 argument constructor, providing default values for all properties (typically, struct
 fields) that are not other learners, and we must implement
 [`LearnAPI.constructor(learner)`](@ref), for recovering the constructor from an instance:
@@ -365,9 +364,41 @@ y = 2a - b + 3c + 0.05*rand(n)
 An implementation may optionally implement [`obs`](@ref), to expose to the user (or some
 meta-algorithm like cross-validation) the representation of input data internal to `fit`
 or `predict`, such as the matrix version `A` of `X` in the ridge example. That is, we may
-factor out of `fit` (and also `predict`) the data pre-processing step, `obs`, to expose
-its outcomes. These outcomes become alternative user inputs to `fit`/`predict`. To see the
-use of `obs` in action, see [below](@ref advanced_demo).
+factor out of `fit` (and also `predict`) a data pre-processing step, `obs`, to expose
+its outcomes. These outcomes become alternative user inputs to `fit`/`predict`.
+
+In the default case, the alternative data representations will implement the MLUtils.jl
+`getobs/numobs` interface for observation subsampling, which is generally all a user or
+meta-algorithm will need, before passing the data on to `fit`/`predict` as you would the
+original data.
+
+So, instead of the pattern
+
+```julia
+model = fit(learner, data)
+predict(model, newdata)
+```
+
+one enables the following alternative (which in any case will still work, because of a
+no-op `obs` fallback provided by LearnAPI.jl):
+
+```julia
+observations = obs(learner, data) # pre-processed training data
+
+# optional subsampling:
+observations = MLUtils.getobs(observations, train_indices)
+
+model = fit(learner, observations)
+
+newobservations = obs(model, newdata)
+
+# optional subsampling:
+newobservations = MLUtils.getobs(newobservations, test_indices)
+
+predict(model, newobservations)
+```
+
+See also the demonstration [below](@ref advanced_demo).

 Here we specifically wrap all the pre-processed data into a single object, for which we
 introduce a new type: