DISCUSSION: Functional Decomposition?

Michael S · Oct 24, 2005

Hi.

I just had a look at a J2EE-project that uses Maven all the way. All the
way!
Hence, by default, some kinda code-checker gives a red flag for a method
being longer than 45 lines (and also for any line longer than 80
characters).

While busy posting in threads about whether code should be a sequence
or a one-liner (regexp vs. for/if/for/orelse), this matter came to my mind.

Why do programmers break up a long method that is one huge sequence into
several tiny private methods that the sequence then calls? Is this a good
thing? Why?

Maven has already decided that a long method is The Bad Thing.
I think Maven takes that for granted. I also think it is a really dumb
assumption. And a dumb value.
Why 45 lines? If there is call for a limit, why wouldn't 20 or 115 lines do
the trick?

Have they asked maintainers what they like to read and modify?

I have a couple of mind-dollars to spend on this, but I just opened with 2
cents on why this is just dumb.
Anyone wanna call? Or raise?

Happy Coding?
- Michael S

Frank Dzaebel · Oct 24, 2005

Hi Michael,

Michael S said:
Hence, by default, some kinda code-checker gives a red flag for a method
being longer than 45 lines (and also for any line longer than 80 characters).
Why do programmers break up a long method that is one huge sequence into
several tiny private methods that the sequence then calls? Is this a good
thing? Why?

Here are some (surely not complete) reasons from my side:

- A human can take in only a limited amount of information
at one time. Also, the method's code fits on a sheet of
paper, a format humans are accustomed to.

- It auto-documents your code, because:
a) you must find a good name for the method
b) you must find a good point to call it (making the structure more obvious)
c) it automatically creates an overview, because
in the method view you see the different "milestones"
of the implementation of the method
d) an auto XML commenter can now easily parse
the code and create a better, more meaningful class
view and documentation.

- With IntelliSense in modern IDEs this
becomes even more meaningful. You can simply press
a key to jump to the function, and the summary tag
gives you a description of the method. So you
see only the essential information about the method.

Sometimes long methods are ok. But I recommend
not going over ~45 lines as a rule of thumb.
This is my personal advice/experience of 14 years
of software development.

ciao Frank

kevin cline · Oct 24, 2005

I've tried it both ways -- long methods, and many shorter methods.
Shorter methods win because:

1. The code is easier to read and understand. Instead of a 100-line
function, you have five smaller 20-line functions driven by a 5-line
function. That five-line function acts as the index for the rest of
the code. It's a lot easier to find the code that calculates sales tax
when there is a function named CalculateSalesTax().

2. Large functions are almost never reusable. Small functions often
are. So some of those little functions will turn out to be useful in
other places. In my experience, a lot of them are. So instead of
three 100-line functions, you end up with eight 20-line functions and
two 5-line driver functions.
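
A minimal sketch of that shape, written in Java since the project under discussion is J2EE. Everything here is made up for illustration (the class name, the discount step, the 8.25% rate); only the idea of a method named after CalculateSalesTax comes from the post above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical example: names, tax rate, and discount step are assumptions.
final class InvoiceCalculator {
    // Assumed tax rate in basis points (8.25%), purely for illustration.
    static final long TAX_RATE_BASIS_POINTS = 825;

    // The short "driver": it reads like an index of the steps below.
    static long calculateTotalCents(List<Long> itemPricesCents, long discountCents) {
        long subtotal = calculateSubtotal(itemPricesCents);
        long discounted = applyDiscount(subtotal, discountCents);
        long tax = calculateSalesTax(discounted);
        return discounted + tax;
    }

    static long calculateSubtotal(List<Long> itemPricesCents) {
        long sum = 0;
        for (long price : itemPricesCents) {
            sum += price;
        }
        return sum;
    }

    static long applyDiscount(long subtotalCents, long discountCents) {
        // Never let a discount push the amount below zero.
        return Math.max(0, subtotalCents - discountCents);
    }

    static long calculateSalesTax(long amountCents) {
        // Integer arithmetic, rounded half up to the nearest cent.
        return (amountCents * TAX_RATE_BASIS_POINTS + 5_000) / 10_000;
    }

    public static void main(String[] args) {
        // 1000 + 2500 = 3500, minus 500 = 3000, plus 248 tax.
        System.out.println(calculateTotalCents(Arrays.asList(1000L, 2500L), 500L)); // 3248
    }
}
```

Someone looking for the tax logic can go straight to calculateSalesTax without reading the other steps, and that method is also the one most likely to be reused elsewhere.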

Nicholas Paldino [.NET/C# MVP] · Oct 24, 2005

Michael,

The use of a number of lines is a way to place a quantifiable value on
something that is difficult to quantify. The idea behind this is to keep
your functions small, and to build the larger operations out of calls to
smaller functions. This helps the maintainability of the code, since operations
are encapsulated better, allowing you to make fixes and changes more easily
without having to alter one big blob of code.

Hope this helps.

Michael S · Oct 24, 2005

Frank Dzaebel said:
not going over ~45 lines as a rule of thumb.

Thanks for your input. Loved it.
But what if this is not a rule of thumb, but Maven raising a red flag and
refusing to deploy? What then?

- Michael S

Michael S · Oct 24, 2005

Nicholas Paldino said:
Michael,

The use of a number of lines is a way to place a quantifiable value on
something that is difficult to quantify. The idea behind this is to keep
your functions small, and to build the larger operations out of calls to
smaller functions. This helps the maintainability of the code, since
operations are encapsulated better, allowing you to make fixes and changes
more easily without having to alter one big blob of code.

Hope this helps.

No, it didn't help, Nicholas.

You are just stating the obvious. Everything you are saying can easily
be found in a textbook and would help you pass an exam at any university, but
I'd like to know why. Why do we do this? Why 45? Why not 42? At least that
would make more sense to me, being a fan. And why a no-go and a red flag if
you have 46 lines of code? Why the check?

What does the line-checker know that I don't know as a coder?

- Michael S

Michael S · Oct 24, 2005

kevin cline said:
I've tried it both ways -- long methods, and many shorter methods.
Shorter methods win because:

1. The code is easier to read and understand. Instead of a 100-line
function, you have five smaller 20-line functions driven by a 5-line
function. That five-line function acts as the index for the rest of
the code. It's a lot easier to find the code that calculates sales tax
when there is a function named CalculateSalesTax().

Could that 'driver' amount to more than 45 lines of code?
If so, would you create a driver to several drivers?

Note that I'm still talking about inside a class. Private methods.

2. Large functions are almost never reusable. Small functions often
are. So some of those little functions will turn out to be useful in
other places. In my experience, a lot of them are. So instead of
three 100-line functions, you end up with eight 20-line functions and
two 5-line driver functions.

That is another topic.
I'm talking about splitting up a sequence into smaller methods just because
Maven says so.
Since when did bot rule brain? Did I miss a meeting?

- Michael S

Nicholas Paldino [.NET/C# MVP] · Oct 24, 2005

Michael,

If it is obvious, then I don't see why you posted the question in the
first place.

The number is nothing more than a guideline that the product you are using,
and a good number of people, happen to use. As with all things, if this
number doesn't suit your needs, then don't use it. Additionally, if your
product doesn't allow you to set this threshold for yourself, then there is
something inherently wrong with the design of the product.

The line checker doesn't know anything. It's a program that has been
told to flag something because of what the author of the program told it to
flag. If you don't like what it does for you, then get another product.

If you think that it is a dumb assumption to limit lines of code to
80 chars and functions to 45 lines, then that's fine. As stated
before, it is a personal preference. While most people would agree that
smaller, functional units would be more beneficial to the maintenance of a
program, those limits would be arbitrary for each individual. Of course,
computers being what they are, they can't gauge arbitrary things, so a line
has to be drawn somewhere. Your code analyzer is drawing it at 45 lines at
80 characters a line.

This is what I mean when I say it is an attempt to place a quantifiable
value on something that is inherently not quantifiable.

Other than that, there is no need to rip on those that are taking time
out of their day to try and address your questions.

Bill Butler · Oct 24, 2005

Michael S said:
Why do we do this? Why 45? Why not 42? At least that would make more sense to me, being a fan. And
why a no-go and a red flag if you have 46 lines of code? Why the check?

OK, I'll bite.

<sarcasm>
Because a bunch of zealots decided to protect you from yourself
</sarcasm>

There is no GOOD reason for an arbitrary limit like this. (At least make it configurable/optional)
There is no GOOD reason why you can't bypass it. (Are you sure you can't?)
Every GOOD rule has times when it makes sense to break it.

None of us (not even Michael) is arguing in favor of LONG monolithic methods. (COBOL anyone?) But,
there ARE times when the driver method makes MORE sense as one continuous logic stream rather than
arbitrarily breaking it up to fit into 45 lines.

There have been a few times over my career when I was unhappy about how sprawling a
method/function/subroutine/procedure/perform had become. At those times you try to refactor the
program to improve the overall understandability. Sometimes the logic does not separate well.
Sometimes there are quite a few steps to the logic and they all belong at the same logical level.
Sometimes it makes more sense to have a LONG method. OK, "LONG" is a relative term. It is longer
than average, but far shorter than some of the monstrosities I have inherited over the years.

An arbitrary limit of 45 that can't be worked around under certain circumstances is just plain
stupid. (Are you SURE you can't find a workaround?)

Oh well, those are my 2 cents.
I'm sure I pissed off a zealot or two
Bill

Bruce Wood · Oct 24, 2005

I, too, would complain about a product that gauged how readable my
methods were based on lines of code. That is, as you stated, stupid.

However, I don't mind an arbitrary limit on method complexity. The
problem is that the tool would have to distinguish between a
_construct_ and a _line_ of code. The best example I can think of is
the switch statement with 100 cases. The tool really should evaluate
such a switch as the maximum number of constructs within any given case
plus one for the switch itself. A switch statement that had at most two
lines of code (that is, two statements) for each case should evaluate
to "3 lines" in total, because if every case does a similar thing then
that is, in fact, the mental complexity that it presents.

Similarly, loops should be evaluated as the complexity of their
contents plus a fixed overhead for the loop itself (perhaps 1 for
foreach and 2 for for), regardless of how it's formatted.
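
A rough sketch of that scoring rule, in Java. The tiny node types below are hypothetical stand-ins for whatever a real parser would produce; the only things taken from the post are the rules themselves (a switch scores its largest case plus one, a loop scores its body plus 1 or 2).

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical mini-AST for illustration; a real checker would build
// something like this from the language's own parser.
final class ConstructComplexity {
    interface Node {
        int score();
    }

    // A single statement always counts as one construct.
    static final class Statement implements Node {
        public int score() {
            return 1;
        }
    }

    // A sequence of constructs scores the sum of its parts.
    static final class Block implements Node {
        private final List<Node> children;

        Block(Node... children) {
            this.children = Arrays.asList(children);
        }

        public int score() {
            int total = 0;
            for (Node child : children) {
                total += child.score();
            }
            return total;
        }
    }

    // A loop scores its body plus a fixed overhead (1 for foreach, 2 for for).
    static final class Loop implements Node {
        private final int overhead;
        private final Node body;

        Loop(int overhead, Node body) {
            this.overhead = overhead;
            this.body = body;
        }

        public int score() {
            return overhead + body.score();
        }
    }

    // A switch scores its most complex case plus one for the switch itself.
    static final class Switch implements Node {
        private final List<Node> cases;

        Switch(Node... cases) {
            this.cases = Arrays.asList(cases);
        }

        public int score() {
            int max = 0;
            for (Node c : cases) {
                max = Math.max(max, c.score());
            }
            return max + 1;
        }
    }

    // Helper: a switch with caseCount cases of statementsPerCase statements each.
    static Node uniformSwitch(int caseCount, int statementsPerCase) {
        Node[] cases = new Node[caseCount];
        for (int i = 0; i < caseCount; i++) {
            Node[] statements = new Node[statementsPerCase];
            Arrays.fill(statements, new Statement());
            cases[i] = new Block(statements);
        }
        return new Switch(cases);
    }

    public static void main(String[] args) {
        System.out.println(uniformSwitch(100, 2).score());        // 3
        System.out.println(new Loop(2, new Statement()).score()); // 3
    }
}
```

The 100-case switch with two statements per case comes out as 3, matching the "3 lines" figure in the post, no matter how the source is laid out.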

A tool that would conclude that this:

for (int i = 0; i < arr.Length; i++) { arr2[i] = arr[i] * 100; }

is only a quarter as complex as this:

for (int i = 0; i < arr.Length; i++)
{
    arr2[i] = arr[i] * 100;
}

deserves to be tossed out with the trash, IMHO.

In other words, in the end, there's nothing wrong with an arbitrary
limit, so long as the tool _can parse the language_. A tool that just
counts characters and lines does nothing more than encourage nasty code
formatting in order to "simplify" the code from the tool's point of
view.
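
To make the difference concrete, here is a small Java sketch comparing a line count with a crude statement count on the same loop written both ways. The statement counter is an assumption-laden stand-in for real parsing: it just counts semicolons outside parentheses and ignores string literals and comments.

```java
// Hypothetical metrics for illustration only; not how any real checker works.
final class LengthMetrics {
    // What a line-based checker sees: the number of non-blank lines.
    static int countLines(String source) {
        int count = 0;
        for (String line : source.split("\n")) {
            if (!line.trim().isEmpty()) {
                count++;
            }
        }
        return count;
    }

    // Crude stand-in for parsing: semicolons outside parentheses, so the
    // two inside a for(...) header are not counted as statements.
    static int countStatements(String source) {
        int depth = 0;
        int count = 0;
        for (char c : source.toCharArray()) {
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
            } else if (c == ';' && depth == 0) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String oneLine = "for (int i = 0; i < arr.length; i++) { arr2[i] = arr[i] * 100; }";
        String fourLines = "for (int i = 0; i < arr.length; i++)\n{\n    arr2[i] = arr[i] * 100;\n}";
        System.out.println(countLines(oneLine) + " vs " + countLines(fourLines));           // 1 vs 4
        System.out.println(countStatements(oneLine) + " vs " + countStatements(fourLines)); // 1 vs 1
    }
}
```

The line count changes with the formatting; the statement count does not, which is the property a length check would need before it could be trusted.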

Of course, even with this refinement it would still be possible to
defeat the tool. A switch statement with 100 cases, where each case
does something completely unrelated to the other ones would not receive
a meaningful complexity value from a tool. Similarly one could (as I'm
sure happens in your workplace), divide large methods into meaningless
"sub-routines" just to get around the 45-line limit, leading to stupid
method names like "Step1", "Step2", etc. Even given the need to police
these evils, though, it's probably better than just letting people run
wild.

Brian Gideon · Oct 24, 2005

Michael said:
Hi.

I just had a look at a J2EE-project that uses Maven all the way. All the
way!
Hence, by default, some kinda code-checker gives a red flag for a method
being longer than 45 lines (and also for any line longer than 80
characters).

80 characters? That's pretty restrictive isn't it?

Michael S · Oct 24, 2005

Nicholas Paldino said:
Michael,
Other than that, there is no need to rip on those that are taking time
out of their day to try and address your questions.

Oh, yes there is my friend..

For example, I ripped on you as I know you could do better. And you did.
While you took the time to get angry, you gave us all an excellent post, and
everyone is a winner.

Happy Discussion
- Michael S

Jon Skeet [C# MVP] · Oct 24, 2005

Michael S said:
I just had a look at a J2EE-project that uses Maven all the way. All the
way!
Hence, by default, some kinda code-checker gives a red flag for a method
being longer than 45 lines (and also for any line longer than 80
characters).

While being busy posting in threads regarding if code should be a sequence
or a one liner (regexp vs. for/if/for/orelse) this matter came to my mind.

Why do programmers break up a long method that is one huge sequence into
several tiny private methods that the sequence then calls? Is this a good
thing? Why?

Yes, it's a good thing. It makes the individual sections easier to
understand and the overall structure easier to understand too. If I
need to know roughly how the "overarching" method works, I can just
look at the short list of calls it makes. If I need to know how one
particular section works in detail, I can look at that method without
needing to worry about the other methods.

I don't think it's worth having arbitrary limits, but I definitely
support refactoring to break large methods into smaller ones.

Jon Skeet [C# MVP] · Oct 24, 2005

Michael S said:
No it didn't help Nicholas.

You are just stating the obvious.

It apparently wasn't obvious to you in your first post:

<quote>
Why do programmers break up a long method that is one huge sequence into
several tiny private methods that the sequence then calls? Is this a
good thing? Why?
</quote>

The question of concrete limits is quite separate to whether splitting
a long method up into shorter ones is a good idea in general.

Michael S · Oct 24, 2005

Brian Gideon said:
80 characters? That's pretty restrictive isn't it?

While I always print code in 'landscape' and not 'portrait', I think it is...
=)
But the bot seems to get the upper hand and says I do stuff wrong...

And when the code-checker in Maven says it is wrong, it won't deploy until I
'fix the problem'.
I almost feel sorry for people who come to this forum and ask if they should
mark classes with some weird attributes just because FxCop says so. At least
they can say no thanks. But I do feel very sorry for the dudes in Maven
who not only get hinted at, but ordered to solve, a potential problem in
Java.

Use brain!

Happy Coding
- Michael S

Jon Skeet [C# MVP] · Oct 24, 2005

Michael S said:
While I always print code in 'landscape' and not 'portrait', I think it is...
=)
But the bot seems to get the upper hand and says I do stuff wrong...

And when the code-checker in Maven says it is wrong, it won't deploy until I
'fix the problem'.
I almost feel sorry for people who come to this forum and ask if they should
mark classes with some weird attributes just because FxCop says so. At least
they can say no thanks. But I do feel very sorry for the dudes in Maven
who not only get hinted at, but ordered to solve, a potential problem in
Java.

You can say no thanks too - you can change the build scripts. I'd be
very, very surprised if this was something Maven itself forced on *all*
projects. It sounds much more likely that this is something which is
part of the build script for the particular project you're looking at.
I suspect if you look for the numbers "45" and "80" in the appropriate
XML files, you'll find where they're being used, and could either
remove the checker altogether or at least change the limits.

Guest · Oct 25, 2005

Hi Michael,

I agree with you on this. I don't break long methods up into a sequence of
shorter methods - I think that just makes the code harder to read. However, I
do like to use comments, and straight lines ("//--------....") to break up
the long method. The only time I create a method is when I expect it to be
called from more than one place.

Regards,

Javaman.

Chris Dunaway · Oct 25, 2005

Michael said:
to know why; Why do we do this? Why 45? Why not 42; at least that would make

In addition to what has already been said, the numbers (80 chars per
line, and 45 lines) probably have their roots in console-type programs
where the screen size is 80 characters wide by 40 lines deep.

Bruce Wood · Oct 25, 2005

My convention is different: I do break up long methods into shorter
methods, even if those shorter methods will be called from only one
place.

However, I have a requirement: the shorter method must have a clear,
simple purpose that I can state in one sentence, and it must be clear
from the name of the method what it does. In other words, I often
extract functionality from a larger method if that little clump of
functionality represents a clear, simple operation that can fit into my
brain as one thought.

For example, I'm quite happy with methods like:

CalculateWeightPerStockingUnit
PromptUserForWarehouses

but I avoid like the plague methods that require names like this:

OpenFilesAndLogErrors
CalculateAndStoreSpecificGravity
Step5

The "And" is a clear indication to me that the code in that method is
just a bunch of unrelated stuff that's been dumped in a bin together
for the sake of arbitrary packaging. Vague, meaningless names like the
last one tell me that someone is trying to break up code for the sake
of breaking up code.

However, if I can gather a dozen lines or so and move them off into a
separate method called "CalculateSpecificGravity", I think that it
makes the code _easier_ to read: you know what those dozen lines are
supposed to do, you can see what inputs are required (especially if the
method is static), and you can see what result it's expected to
produce. You can test it in isolation. It's all good, even if the
method is called in only one place.
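
As a sketch of what such an extracted method might look like in Java: the formula (sample density divided by the density of water), the parameter names, and the 1000 kg/m3 reference value are all assumptions for illustration, since the post gives only the method name.

```java
// Hypothetical extraction; only the name CalculateSpecificGravity is from the post.
final class MaterialMath {
    // Assumed reference value: density of water, roughly 1000 kg per cubic metre.
    static final double WATER_DENSITY_KG_PER_M3 = 1000.0;

    // Static, so the inputs it depends on are exactly its two parameters.
    static double calculateSpecificGravity(double massKg, double volumeM3) {
        if (volumeM3 <= 0) {
            throw new IllegalArgumentException("volume must be positive");
        }
        double density = massKg / volumeM3;
        return density / WATER_DENSITY_KG_PER_M3;
    }

    public static void main(String[] args) {
        // A 1 cubic metre sample weighing 2700 kg.
        System.out.println(calculateSpecificGravity(2700.0, 1.0));
    }
}
```

Because the method is static and takes everything it needs as arguments, it can be checked on its own with a couple of known values, without setting up the larger method it was pulled from.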

Mike Hofer · Oct 25, 2005

Michael said:
That is another topic.
I'm talking about splitting up a sequence into smaller methods just because
Maven says so.
Since when did bot rule brain? Did I miss a meeting?

Actually, it's a related topic, and one that has a direct impact on why
you would want to limit your functions to smaller sizes. It shouldn't
be dismissed out of hand (not that you're doing so).

Are you sure Maven doesn't allow you to customize the limit for
function sizes? Most modern tools allow you to set values to something
appropriate to your own business process.

Finally, if Maven doesn't allow you to do this, and you CAN'T make it
happy, and it's preventing you from deploying your application, you
shouldn't be using it. You have a product to get out the door. If the
bot is preventing you from doing it due to some arbitrary rule that it
imposes, you should fire the bot as soon as possible.

My own two bits is this: smaller routines are easier for my mind to
grapple with. In addition, once I've debugged a routine and know that
it works, it's a piece I don't have to worry about again. Larger
functions are harder for me to deal with because there's too much going
on at once.

I write smaller routines so it's easier to maintain; Maven is likely
trying to force the same habit on you. But it *HAS* to be flexible
enough to allow for larger routines when they're called for. If it's
not doing that, it's got some serious issues.
