@@ -71,12 +71,12 @@ one of those reasons, it is safe to train your TF-DF on train+validation (unless
the validation split is also used for something else, like hyperparameter
tuning).

- ``` python {.bad}
- model.fit(train_ds, validation_data=val_ds)
+ ``` diff {.bad}
+ - model.fit(train_ds, validation_data=val_ds)
```

- ``` python {.good}
- model.fit(train_ds.concatenate(val_ds))
+ ``` diff {.good}
+ + model.fit(train_ds.concatenate(val_ds))

# Or just don't create a validation dataset
```
@@ -91,17 +91,17 @@ needed, it will be extracted automatically from the training dataset.

#### Train for exactly 1 epoch

- ``` python {.bad}
+ ``` diff {.bad}
# Number of epochs in Keras
- model.fit(train_ds, epochs=5)
+ - model.fit(train_ds, epochs=5)

# Number of epochs in the dataset
- train_ds = train_ds.repeat(5)
- model.fit(train_ds)
+ - train_ds = train_ds.repeat(5)
+ - model.fit(train_ds)
```

- ``` python {.good}
- model.fit(train_ds)
+ ``` diff {.good}
+ + model.fit(train_ds)
```

**Rationale:** Users of neural networks often train a model for N steps (which
@@ -116,13 +116,13 @@ unnecessary data I/O, as well as slower training.
Datasets do not need to be shuffled (unless the input_fn is reading only a
sample of the dataset).

- ``` python {.bad}
- train_ds = train_ds.shuffle(5)
- model.fit(train_ds)
+ ``` diff {.bad}
+ - train_ds = train_ds.shuffle(5)
+ - model.fit(train_ds)
```

- ``` python {.good}
- model.fit(train_ds)
+ ``` diff {.good}
+ + model.fit(train_ds)
```

**Rationale:** TF-DF shuffles access to the data internally after reading the
@@ -136,15 +136,15 @@ However, this will make the training procedure non-deterministic.

The batch size will not affect the model quality.

- ``` python {.bad}
- train_ds = train_ds.batch(hyper_parameter_batch_size())
- model.fit(train_ds)
+ ``` diff {.bad}
+ - train_ds = train_ds.batch(hyper_parameter_batch_size())
+ - model.fit(train_ds)
```

- ``` python {.good}
+ ``` diff {.good}
# The batch size does not matter.
- train_ds = train_ds.batch(64)
- model.fit(train_ds)
+ + train_ds = train_ds.batch(64)
+ + model.fit(train_ds)
```

**Rationale:** Since TF-DF is always trained on the full dataset after it is
@@ -214,37 +214,37 @@ transformations. By default, all of the features in the dataset (other than the
label) will be detected and used by the model. The feature semantics will be
auto-detected, and can be overridden manually if needed.

- ``` python {.bad}
+ ``` diff {.bad}
# Estimator code
- feature_columns = [
- tf.feature_column.numeric_column("feature_1"),
- tf.feature_column.categorical_column_with_vocabulary_list("feature_2", ['First', 'Second', 'Third'])
- ]
- model = tf.estimator.LinearClassifier(feature_columns=feature_columns)
+ - feature_columns = [
+ - tf.feature_column.numeric_column("feature_1"),
+ - tf.feature_column.categorical_column_with_vocabulary_list("feature_2", ['First', 'Second', 'Third'])
+ - ]
+ - model = tf.estimator.LinearClassifier(feature_columns=feature_columns)
```

- ``` python {.good}
+ ``` diff {.good}
# Use all the available features. Detect the type automatically.
- model = tfdf.keras.GradientBoostedTreesModel()
+ + model = tfdf.keras.GradientBoostedTreesModel()
```

You can also specify a subset of input features:

- ``` python {.good}
- features = [
- tfdf.keras.FeatureUsage(name="feature_1"),
- tfdf.keras.FeatureUsage(name="feature_2")
- ]
- model = tfdf.keras.GradientBoostedTreesModel(features=features, exclude_non_specified_features=True)
+ ``` diff {.good}
+ + features = [
+ + tfdf.keras.FeatureUsage(name="feature_1"),
+ + tfdf.keras.FeatureUsage(name="feature_2")
+ + ]
+ + model = tfdf.keras.GradientBoostedTreesModel(features=features, exclude_non_specified_features=True)
```

If necessary, you can force the semantic of a feature.

- ``` python {.good}
- forced_features = [
- tfdf.keras.FeatureUsage(name="feature_1", semantic=tfdf.keras.FeatureSemantic.CATEGORICAL),
- ]
- model = tfdf.keras.GradientBoostedTreesModel(features=forced_features)
+ ``` diff {.good}
+ + forced_features = [
+ + tfdf.keras.FeatureUsage(name="feature_1", semantic=tfdf.keras.FeatureSemantic.CATEGORICAL),
+ + ]
+ + model = tfdf.keras.GradientBoostedTreesModel(features=forced_features)
```

**Rationale:** While certain models (like Neural Networks) require a
@@ -262,29 +262,29 @@ remove all pre-processing that was designed to help neural network training.

#### Do not normalize numerical features

- ``` python {.bad}
- def zscore(value):
-   return (value - mean) / sd
+ ``` diff {.bad}
+ - def zscore(value):
+ -   return (value - mean) / sd

- feature_columns = [tf.feature_column.numeric_column("feature_1", normalizer_fn=zscore)]
+ - feature_columns = [tf.feature_column.numeric_column("feature_1", normalizer_fn=zscore)]
```

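By contrast, a minimal sketch of feeding the raw, unscaled values straight to a TF-DF model (the toy pandas dataframe and the choice of the Random Forest learner are illustrative assumptions, not part of the original example):

``` python
import pandas as pd
import tensorflow_decision_forests as tfdf

# Toy data with widely spread numerical values, used as-is: no z-scoring,
# no clipping, no other normalization.
df = pd.DataFrame({"feature_1": [3.0, 250.0, 13000.0, 7.5], "label": [0, 1, 1, 0]})
train_ds = tfdf.keras.pd_dataframe_to_tf_dataset(df, label="label")

model = tfdf.keras.RandomForestModel()
model.fit(train_ds)
```
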
- **Rationale :** Decision forest algorithms natively support non-normalized
+ **Rationale:** Decision forest algorithms natively support non-normalized
numerical features, since the splitting algorithms do not do any numerical
transformation of the input. Some types of normalization (e.g. zscore
normalization) will not help numerical stability of the training procedure, and
some (e.g. outlier clipping) may hurt the expressiveness of the final model.

#### Do not encode categorical features (e.g. hashing, one-hot, or embedding)

- ``` python {.bad}
- integerized_column = tf.feature_column.categorical_column_with_hash_bucket("feature_1", hash_bucket_size=100)
- feature_columns = [tf.feature_column.indicator_column(integerized_column)]
+ ``` diff {.bad}
+ - integerized_column = tf.feature_column.categorical_column_with_hash_bucket("feature_1", hash_bucket_size=100)
+ - feature_columns = [tf.feature_column.indicator_column(integerized_column)]
```

- ``` python {.bad}
- integerized_column = tf.feature_column.categorical_column_with_vocabulary_list('feature_1', ['bob', 'george', 'wanda'])
- feature_columns = [tf.feature_column.indicator_column(integerized_column)]
+ ``` diff {.bad}
+ - integerized_column = tf.feature_column.categorical_column_with_vocabulary_list('feature_1', ['bob', 'george', 'wanda'])
+ - feature_columns = [tf.feature_column.indicator_column(integerized_column)]
```

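By contrast, a minimal sketch of passing the raw categorical strings with no encoding at all (again assuming a toy pandas dataframe and the Random Forest learner):

``` python
import pandas as pd
import tensorflow_decision_forests as tfdf

# The string values are consumed directly: no hashing, no vocabulary list,
# no one-hot encoding, no embedding.
df = pd.DataFrame({"feature_1": ["bob", "george", "wanda", "bob"],
                   "label": [0, 1, 1, 0]})
train_ds = tfdf.keras.pd_dataframe_to_tf_dataset(df, label="label")

model = tfdf.keras.RandomForestModel()
model.fit(train_ds)
```
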
**Rationale:** TF-DF has native support for categorical features, and will treat
@@ -315,11 +315,11 @@ networks, which may propagate NaNs to the gradients if there are NaNs in the
input, TF-DF will train optimally if the algorithm sees the difference between
missing and a sentinel value.

- ``` python {.bad}
- feature_columns = [
- tf.feature_column.numeric_column("feature_1", default_value=0),
- tf.feature_column.numeric_column("feature_1_is_missing"),
- ]
+ ``` diff {.bad}
+ - feature_columns = [
+ - tf.feature_column.numeric_column("feature_1", default_value=0),
+ - tf.feature_column.numeric_column("feature_1_is_missing"),
+ - ]
```

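By contrast, a minimal sketch of leaving the missing values untouched (same illustrative assumptions: a toy pandas dataframe and the Random Forest learner):

``` python
import math

import pandas as pd
import tensorflow_decision_forests as tfdf

# The missing entry stays as NaN: no imputation and no extra
# "feature_1_is_missing" indicator column.
df = pd.DataFrame({"feature_1": [1.2, math.nan, 3.4, 0.7], "label": [0, 1, 0, 1]})
train_ds = tfdf.keras.pd_dataframe_to_tf_dataset(df, label="label")

model = tfdf.keras.RandomForestModel()
model.fit(train_ds)
```
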
#### Handling Images and Time series
@@ -381,22 +381,22 @@ dataset reads are deterministic as well.

#### Specify a task (e.g. classification, ranking) instead of a loss (e.g. binary cross-entropy)

- ``` python {.bad}
- model = tf.keras.Sequential()
- model.add(Dense(64, activation='relu'))
- model.add(Dense(1)) # One output for binary classification
+ ``` diff {.bad}
+ - model = tf.keras.Sequential()
+ - model.add(Dense(64, activation='relu'))
+ - model.add(Dense(1)) # One output for binary classification

- model.compile(loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
- optimizer='adam',
- metrics=['accuracy'])
+ - model.compile(loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
+ - optimizer='adam',
+ - metrics=['accuracy'])
```

- ``` python {.good}
+ ``` diff {.good}
# The loss is automatically determined from the task.
- model = tfdf.keras.GradientBoostedTreesModel(task=tfdf.keras.Task.CLASSIFICATION)
+ + model = tfdf.keras.GradientBoostedTreesModel(task=tfdf.keras.Task.CLASSIFICATION)

# Optional if you want to report the accuracy.
- model.compile(metrics=['accuracy'])
+ + model.compile(metrics=['accuracy'])
```

**Rationale:** Not all TF-DF learning algorithms use a loss. For those that do,
@@ -489,13 +489,13 @@ print(tree)
TF-DF does not yet support TF distribution strategies. Multi-worker setups will
be ignored, and the training will only happen on the manager.

- ``` python {.bad}
- with tf.distribute.MirroredStrategy().scope():
-   model = ...
+ ``` diff {.bad}
+ - with tf.distribute.MirroredStrategy().scope():
+ -   model = ...
```

- ``` python {.good}
- model = ...
+ ``` diff {.good}
+ + model = ...
```

#### Stacking Models