Ruby on Rails?

Andrew McDonagh

Chris said:
Hopefully, no one uses UML to document all the kinds of things that are
documented automatically by type systems. That would be a pain. Even
if they did, UML diagrams can generally get out of date, especially if
the code doesn't contain enough information to keep them up to date with
tools that do that sort of thing. JavaDoc (or any equivalent, as you
say Ruby has) can definitely get out of date. Documentation expressed
by the type system within the syntax of the language doesn't get out of
date.

I did say 'depends upon what type you want'. I didn't mean that to come
across as using them to document type relations that are already handled
by type systems.

Unit tests as documentation is another thing. They are kept up to date
with code. The weakness, of course, lies in being so separated from
code; often residing in a different directory from the code that is
being documented by that test.

This is never a problem - the unit tests are considered at least as
valuable as the production code and are therefore always compiled and
run as part of the build.

If they don't compile (because they haven't been kept up to date) then
the build fails.

If they don't ALL pass (because they haven't been kept up to date) then
the build fails.

Basically they either are up to date or the build fails.
The other weakness is that the tests are
in a language that is fundamentally just as opaque to reasoning about
behavior as the original programming language (in fact, it generally is
the same language). While one might hope that the code would be kept at
least a little simpler -- or perhaps that you'd write unit tests or
acceptance tests for the unit tests or acceptance tests -- the result
will still not match a form of expression designed for describing facts
about code that are the basis for simple reasoning about behavior.

Unit tests are best written in the same language as the code, as they
are only to describe what the code IS and how to USE it.

Their target audience is developers.

Acceptance Tests, on the other hand, should indeed be in a domain-specific
language, as this allows the stakeholders, testers and developers to
understand what the system is supposed to do.


Unit tests are for Building the Code Right.

Acceptance Tests are for Building the Right Code.

Interestingly, if I start to imagine languages in which simple unit
tests can be written inline to put the documentation closer to the code,
and possibly in a simpler language that is more prone to reasoning about
behavior, then I end up thinking about a proper design-by-contract
language. Hmm...
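
One existing approximation of this idea, sketched in Python: the standard `doctest` module runs examples embedded in a docstring, and plain `assert` statements can stand in for pre- and postconditions. The names `integer_sqrt` and `inline_examples_hold` are hypothetical, and asserts are only a rough stand-in for a real design-by-contract language.

```python
import doctest


def integer_sqrt(n):
    """Return the largest integer r with r * r <= n.

    The examples below are executable documentation sitting next to the code:

    >>> integer_sqrt(0)
    0
    >>> integer_sqrt(15)
    3
    >>> integer_sqrt(16)
    4
    """
    # Precondition: the caller's side of the contract.
    assert isinstance(n, int) and n >= 0, "n must be a non-negative integer"

    r = 0
    while (r + 1) * (r + 1) <= n:
        r += 1

    # Postcondition: the implementation's side of the contract.
    assert r * r <= n < (r + 1) * (r + 1)
    return r


def inline_examples_hold():
    """Run the docstring examples of integer_sqrt; True if none failed."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    failed = 0
    tests = finder.find(integer_sqrt, "integer_sqrt",
                        globs={"integer_sqrt": integer_sqrt})
    for test in tests:
        # Discard the runner's report; only the failure count matters here.
        failed += runner.run(test, out=lambda text: None).failed
    return failed == 0
```

The contract states facts about every call, while the inline examples only cover the inputs someone thought to write down, which is roughly the gap described above.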


Having the unit tests in a parallel directory structure is only one of a
few standard ways of organising them. Lots of teams have the test code
in the same directory as the production code.


Sure, or other tools. Doesn't really matter if they are integrated or
not.


Yes. However, a good portion of the refactorings you list are
computationally intractable in Smalltalk in the general case. I am not
a Smalltalk programmer, and I don't know how the refactoring tools
handled that... perhaps they would give up, or perhaps they would use
some heuristic to guess at the meaning of the code, and
rely on your unit tests to catch any errors that are introduced. Either
could be reasonable, depending on how good the heuristic can be made,
but either one is also a significant barrier to having usable tools.
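
A small sketch of why a rename can be hard to get right without static types. Python stands in for Smalltalk here, and `Account`, `Socket`, `shut_down` and `call_by_name` are hypothetical names; this says nothing about how any actual refactoring browser handles such cases.

```python
class Account:
    def close(self):
        return "account closed"


class Socket:
    def close(self):
        return "socket closed"


def shut_down(resource):
    # Nothing in the syntax says which class `resource` belongs to, so a tool
    # renaming Account.close cannot tell whether this call site must change.
    return resource.close()


def call_by_name(obj, prefix, suffix):
    # The method name only exists at run time; no static search for "close"
    # will find this call site.
    return getattr(obj, prefix + suffix)()
```

A tool faced with `shut_down` has to either rename every `close` it sees, rename none, or guess, which is the choice between giving up and using a heuristic.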

And yet in practice this does not seem to be the case....

So clearly, tools can be written for dynamic languages. The question is
whether they can be as good as tools for typed languages can be. (I'm
arguing the less popular side here; I would, however, probably agree if
you point out that an interactive tool that works 99.99% of the time is
just as good as one that works 100% of the time. I don't know if
Smalltalk's refactorings really do work 99.99% of the time or not.)

I haven't yet found a tool for statically typed languages that works
99.99% of the time (Eclipse, IntelliJ, etc.).

I don't personally find that the language type system has any impact on
usability or reliability.

Like, for example, implementation of polymorphic method dispatch in
practically the same time cost as a standard procedure call by using
vtables. Also, as an example, some of the work being done toward
optimizing away heap allocations by moving data onto the stack in Java
depends on static type analysis.

Chris Uppal has already answered this one better than I can.
 
Chris Smith

Sorry for the delay. For some reason, my server just now got all of
your posts from the last few days.

Chris Uppal said:
Hmm. I would be reluctant -- /extremely/ reluctant -- to talk of a "standard
text" on this subject.

Okay. It's the one I've heard of most often, but perhaps this is a
provincial thing.
It would certainly be completely wrong to speak of dynamically-typed languages
having dynamic type /analysis/ -- analysis is a static concept[*]. But talking
about type /systems/ is not the same as talking about type /analysis/. In an
extreme case it might be that the system existed, and was well-defined, but
wasn't subject to analysis. Such a system might not be much use (who knows ?),
but the idea is not /incoherent/. Dynamic languages certainly have type
systems -- there is an abstract logic of what [sequences of] operations have
meaning.

My dispute is whether that logic is tractable and syntactic. In other
words, can you prove whether or not an operation (or sequence) has
meaning given only the syntax of the language, without formal semantics
such as evaluation rules or axioms? If this is not
true, then I'd think you would need to at least very carefully qualify
calling such things "types" at all.

It's interesting that the word "type" is used in exactly this way, in
general, with Java as well. When it's said that variables have types
but objects have classes, that means that only variables have
annotations about their possible values that are used in syntactic
analysis of the program to prove the absence of behaviors. Of course,
the behaviors whose absence is proven may in fact have to do with
objects -- for example, it is proven that a program may never call a
method that doesn't exist. However, the information used by these
checks is purely syntactic. Java is also dynamically checked, so errors
are raised at runtime if an incorrect downcast is performed or an array
is accessed out of bounds. However, we don't consider the class of an
object or the length of an array part of the "type" because it is not
used by the type system; only by the dynamic checks that ensure program
safety.
The other use of the phrase "type system" is to mean "that part of the language
system which does type checking". In C++ that part is in the compiler. In
Java it is split between the compiler and runtime. In Smalltalk it's all in
the runtime.

This is actually assuming the faulty definition of types to begin with.
If types are as they are understood in the texts I know on type theory
in CS, and as they are used in Java, for example, then it is *not* true
that type checking is done by the runtime in Smalltalk. It is true that
program safety is implemented by runtime checks in Smalltalk. However,
these checks relate to concepts that are not types.
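
A minimal sketch of that distinction, with Python standing in for a dynamically checked language. The names `total_length` and `checked` are hypothetical. Nothing about the program is proven before it runs; safety comes from checks that fire when an operation is actually attempted.

```python
def total_length(items):
    # No annotation is verified before the program runs; whether `len` has
    # meaning for each element is discovered only when this line executes.
    return sum(len(item) for item in items)


def checked(operation):
    """Run operation and report which runtime safety check, if any, fired."""
    try:
        return ("ok", operation())
    except (TypeError, IndexError, AttributeError) as error:
        return ("rejected", type(error).__name__)
```

For instance, `checked(lambda: total_length(["ab", 3]))` is rejected with a `TypeError` only once the second element is reached, and `checked(lambda: [1, 2, 3][5])` is rejected by a bounds check. Both are safety checks on runtime values, not facts derived from the program text.
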
You may not like that particular use of the phrase, but it seems
to have become established.

Indeed it is established in some circles, in any case -- mostly, as a
matter of fact, in communities of people working with some "dynamically
typed" language. I suspect that this has historical roots, springing
from the false idea that untyped languages would also necessarily be
unsafe. In any case, I'm not quite ready to give up the meaning of a
perfectly useful term like "type"... especially when there exists
perfectly well-respected literature that still uses the term.

You seem to be arguing for removing any kind of validity at all from the
word "untyped"... or perhaps reserve it for cases where the bad program
behavior that we can't prove to be absent is actually undefined by the
language rather than merely bad. This is either pointless (in the
former), or not very close to the ideas implied by the word "type" and
closer to the word "safe" (in the latter).
Consider the following assertion. The price that statically analysed languages
pay is that they must accept an impoverished type system (logic of allowable
operations) in order to have a practically implementable ("tractable";-) type
checker. Dynamic languages, on the other hand, have a type system (logic)
which is /exactly/ as wide as is possible, but at the cost of a type checker
working at runtime. You may or may not agree with the assertion; but my point
is that the assertion is meaningful.

I don't believe that it is meaningful, though I can infer what you mean
from my knowledge that "dynamically typed" is often used as a synonym of
"untyped". I can translate that to "dynamically checked", and I can
translate the statement to the following, which IS meaningful:

The price that typed languages pay is that they must accept an
impoverished program semantics in order to have a tractable type
checker. The benefit is that they can convert a subset of the
necessary safety checks into static constraints. Dynamic languages,
on the other hand, have a program semantics that is defined without
that specific constraint, but they lose the benefit, as well.

I think my statement is clearer than yours, incidentally, for a number
of reasons. They include not just a substitution of words, but also a
removal of the false dichotomy between "type checking" and other
language safety checks which just don't happen to be part of the type
checker in our hypothetical typed language above.

Incidentally, I do agree with the (translated) statement.

--
www.designacourse.com
The Easiest Way To Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation
 
Chris Smith

Chris Uppal said:
I don't think vtables are widely regarded as the best implementation technique,
even for languages which can use them. Routinely jumping through an
indirection is a killer.

You are possibly right for environments with less impoverished
development tools... but at least as of about five years ago, lookup in
the vtable was basically the way to do C++ virtual method dispatch in
run-of-the-mill build processes with separate per-class compilation
and a relatively dumb linker. Method inlining may be
possible if the compiler can do global program analysis, but that's not
commonly possible in that build environment. There is certainly no
reason to believe that doing this killed the performance of C++
applications.
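
For readers unfamiliar with the technique, a toy model of vtable dispatch, written in Python purely for illustration. Python itself does not dispatch this way, and every name below is hypothetical. The point is only that the slot number is fixed ahead of time, so a call is a table load, an indexed load and an indirect call.

```python
# Slot numbers are fixed when the "classes" are laid out, so a call site
# never searches for a method by name.
SLOT_AREA = 0
SLOT_NAME = 1


def square_area(obj):
    return obj["side"] * obj["side"]


def square_name(obj):
    return "square"


def rect_area(obj):
    return obj["width"] * obj["height"]


def rect_name(obj):
    return "rectangle"


# One table per class, shared by all of its instances.
SQUARE_VTABLE = (square_area, square_name)
RECT_VTABLE = (rect_area, rect_name)


def make_square(side):
    return {"vtable": SQUARE_VTABLE, "side": side}


def make_rect(width, height):
    return {"vtable": RECT_VTABLE, "width": width, "height": height}


def virtual_call(obj, slot):
    # Load the table from the object, index it by slot, call the result.
    return obj["vtable"][slot](obj)
```

The extra load through `obj["vtable"]` is the indirection whose cost is being debated here.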

Of course, there are a number of factors here. Typical C++ code
involves quite a bit fewer virtual method dispatches than typical
Smalltalk code (or even typical Java code). If the vtables end up
cached, though, the indirection doesn't really cost much of anything at
all. It is also an advantage of runtime native code generation (as in
Java and Smalltalk) that complete information about the environment is
available during the build.
Also, as an example, some of the work being done toward
optimizing away heap allocations by moving data onto the stack in Java
depends on static type analysis.

That's escape analysis [...]

You're right. I was confusing myself.

 
Chris Smith

I did say 'depends upon what type you want'. I didn't mean that to come
across as using them to document type relations that are already handled
by type systems.

Ah. I'm somewhat at a loss, then. I was talking about type systems
providing documentation automatically. That seems to have little to do
with UML at all.
This is never a problem - the unit tests are considered at least as
valuable as the production code and are therefore always compiled and
run as part of the build.

This, again, seems to have nothing to do with whether unit tests provide
a benefit equivalent to that of the self-documentation of a type system.
Unit tests are best written in the same language as the code, as they
are only to describe what the code IS and how to USE it.

Their target audience is developers.

I'm not so sure that's true, or at least that it's the whole story.
Their target audience is at least two-fold: developers, and the compiler
and/or runtime that executes them. The question is whether they meet
the two aims well. Clearly, if the point is to demonstrate that a certain
piece of code is correct, it is not sufficient to use a test case whose
correctness is just as much in doubt as the original code. This is even
less sufficient when we consider unit tests as documentation rather than
as active tests.

Acceptance tests may be in a different language, but that language is no
more suited to understanding the code than unit tests. Probably less
so, as it is designed for different goals.
Unit tests are for Building the Code Right.
Acceptance Tests are for Building the Right Code.

But these are not what we're talking about. We're talking about how to
help developers understand the assumptions and invariants and other
elements of meaning that can be attributed to code but are not contained
within the functional part of the code itself. Unit tests at least
sometimes help with that, and acceptance tests do nothing at all.
Having the unit tests in a parallel directory structure is only one of a
few standard ways of organising them. Lots of teams have the test code
in the same directory as the production code.

Yet rarely in the same source file sitting next to the code where it can
be easily used to see what the code is trying to do...
[Smalltalk refactoring]

And yet in practice this does not seem to be the case....

Having never used the Smalltalk refactoring browser, I can't say much
here. It may very well have some very good heuristics involved so that
it almost never breaks your code or fails. I don't know. I know it
doesn't work ALL the time, but I can't say how often it does work.
I haven't yet found a tool for statically typed languages that works
99.99% of the time (Eclipse, IntelliJ, etc.).

Do you mean that Eclipse or IntelliJ or some other editor may, for
example, forget to rename a call site when you choose to rename a method?
I've never seen that. Can you give an example?

 
Chris Uppal

Chris said:
Of course, there are a number of factors here. Typical C++ code
involves quite a bit fewer virtual method dispatches than typical
Smalltalk code (or even typical Java code).
Indeed.


If the vtables end up
cached, though, the indirection doesn't really cost much of anything at
all.

Depends on machine architecture. I'm no expert in this (nor even a
well-informed non-expert), but you have issues like pipeline stalls,
speculative execution probably isn't possible over an indirection, the CPU's
instruction decoder has probably "read ahead" into the wrong place, ....

-- chris
 
