Difference between revisions of "Pattern Matching Assignment"

From CSE425S Wiki
Jump to navigation Jump to search
 
(32 intermediate revisions by the same user not shown)
Line 2: Line 2:
 
[https://www.coursera.org/learn/programming-languages/supplement/FEymH/hints-and-gotchas-for-section-3 Hints and Gotchas]
 
[https://www.coursera.org/learn/programming-languages/supplement/FEymH/hints-and-gotchas-for-section-3 Hints and Gotchas]
  
=Instructions=
+
=Credit=
 
{{GrossmanCredit}}
 
{{GrossmanCredit}}
 
[https://courses.cs.washington.edu/courses/cse341/13sp/hw3.pdf UW] or [https://www.coursera.org/learn/programming-languages/programming/qeafu/homework-3-instructions Coursera]
 
[https://courses.cs.washington.edu/courses/cse341/13sp/hw3.pdf UW] or [https://www.coursera.org/learn/programming-languages/programming/qeafu/homework-3-instructions Coursera]
  
 
=Code To Implement=
 
=Code To Implement=
 +
{{SMLToImplement|uw3|only_capitals<br/>longest_string1<br/>longest_string_helper<br/>longest_string3<br/>longest_string4<br/>longest_capitalized<br/>rev_string<br/>first_answer<br/>all_answers<br/>count_wildcards<br/>count_wild_and_variable_lengths<br/>count_some_var<br/>check_pat<br/>match<br/>first_match<br/>typecheck_patterns|uw3}}
 +
 
You will define several SML functions. Many will be very short because they will use other higher-order
 
You will define several SML functions. Many will be very short because they will use other higher-order
 
functions. You may use functions in ML’s library; the problems point you toward the useful functions and
 
functions. You may use functions in ML’s library; the problems point you toward the useful functions and
Line 15: Line 17:
 
==String Functions==
 
==String Functions==
  
===1) only_captials===
+
===1) only_capitals===
Write a function only_capitals that takes a string list and returns a string list that has only
+
Write a function <code>only_capitals</code> that takes a <code>string list</code> and returns a <code>string list</code> that has only
 
the strings in the argument that start with an uppercase letter. Assume all strings have at least 1
 
the strings in the argument that start with an uppercase letter. Assume all strings have at least 1
character. Use List.filter, Char.isUpper, and String.sub to make a 1-2 line solution.
+
character. Use [https://smlfamily.github.io/Basis/list.html#SIG:LIST.filter:VAL List.filter], [https://smlfamily.github.io/Basis/char.html#SIG:CHAR.isUpper:VAL Char.isUpper], and [https://smlfamily.github.io/Basis/string.html#SIG:STRING.sub:VAL String.sub] to make a 1-2 line solution.
  
 
===2) longest_string1===
 
===2) longest_string1===
Write a function longest_string1 that takes a string list and returns the longest string in the
+
Write a function <code>longest_string1</code> that takes a <code>string list</code> and returns the longest <code>string</code> in the
 
list. If the list is empty, return "". In the case of a tie, return the string closest to the beginning of the
 
list. If the list is empty, return "". In the case of a tie, return the string closest to the beginning of the
list. Use foldl, String.size, and no recursion (other than the implementation of foldl is recursive).
+
list. Use [https://smlfamily.github.io/Basis/list.html#SIG:LIST.foldl:VAL foldl], [https://smlfamily.github.io/Basis/string.html#SIG:STRING.size:VAL String.size], and no recursion (other than the implementation of foldl is recursive).
  
 
===3) longest_string2===
 
===3) longest_string2===
Write a function longest_string2 that is exactly like longest_string1 except in the case of ties
+
Write a function <code>longest_string2</code> that is exactly like <code>longest_string1</code> except in the case of ties
 
it returns the string closest to the end of the list. Your solution should be almost an exact copy of
 
it returns the string closest to the end of the list. Your solution should be almost an exact copy of
longest_string1. Still use foldl and String.size.
+
longest_string1. Still use [https://smlfamily.github.io/Basis/list.html#SIG:LIST.foldl:VAL foldl] and [https://smlfamily.github.io/Basis/string.html#SIG:STRING.size:VAL String.size].
  
===4) longest_string_helper, longest_string3, longest_string4===
+
===4) longest_string_helper, longest_string3, and longest_string4===
Write functions longest_string_helper, longest_string3, and longest_string4 such that:
+
Write functions <code>longest_string_helper</code>, <code>longest_string3</code>, and <code>longest_string4</code> such that:
* longest_string3 has the same behavior as longest_string1 and longest_string4 has the same behavior as longest_string2.
+
* <code>longest_string3</code> has the same behavior as <code>longest_string1</code> and <code>longest_string4</code> has the same behavior as <code>longest_string2</code>.
* longest_string_helper has type (int * int -> bool) -> string list -> string (notice the currying). This function will look a lot like longest_string1 and longest_string2 but is more general because it takes a function as an argument.
+
* <code>longest_string_helper</code> has type <code>(int * int -> bool) -> string list -> string</code> (notice the currying). This function will look a lot like <code>longest_string1</code> and <code>longest_string2</code> but is more general because it takes a function as an argument.
* If longest_string_helper is passed a function that behaves like > (so it returns true exactly when its first argument is stricly greater than its second), then the function returned has the same behavior as longest_string1.
+
* If <code>longest_string_helper</code> is passed a function that behaves like > (so it returns true exactly when its first argument is strictly greater than its second), then the function returned has the same behavior as <code>longest_string1</code>.
* longest_string3 and longest_string4 are defined with val-bindings and partial applications of longest_string_helper.
+
* <code>longest_string3</code> and <code>longest_string4</code> are defined with val-bindings and partial applications of <code>longest_string_helper</code>.
  
 
===5) longest_capitalized===
 
===5) longest_capitalized===
Write a function longest_capitalized that takes a string list and returns the longest string in
+
Write a function <code>longest_capitalized</code> that takes a <code>string list</code> and returns the longest string in
 
the list that begins with an uppercase letter, or "" if there are no such strings. Assume all strings
 
the list that begins with an uppercase letter, or "" if there are no such strings. Assume all strings
 
have at least 1 character. Use a val-binding and the ML library’s o operator for composing functions.
 
have at least 1 character. Use a val-binding and the ML library’s o operator for composing functions.
Line 44: Line 46:
  
 
===6) rev_string===
 
===6) rev_string===
Write a function rev_string that takes a string and returns the string that is the same characters in
+
Write a function <code>rev_string</code> that takes a <code>string</code> and returns the <code>string</code> that is the same characters in
reverse order. Use ML’s o operator, the library function rev for reversing lists, and two library functions
+
reverse order. Use ML’s o operator, the library function [https://smlfamily.github.io/Basis/list.html#SIG:LIST.rev:VAL rev] for reversing lists, and two library functions
in the String module. (Browse the module documentation to find the most useful functions.)
+
in the [https://smlfamily.github.io/Basis/string.html String] module. (Browse [https://smlfamily.github.io/Basis/string.html the module documentation] to find the most useful functions.)
  
 
==Utilities For Pattern Matching==
 
==Utilities For Pattern Matching==
Line 52: Line 54:
  
 
===7) first_answer===
 
===7) first_answer===
Write a function first_answer of type (’a -> ’b option) -> ’a list -> ’b (notice the 2 arguments are curried). The first argument should be applied to elements of the second argument in order
+
Write a function <code>first_answer</code> of type <code>(’a -> ’b option) -> ’a list -> ’b</code> (notice the 2 arguments are curried). The first argument should be applied to elements of the second argument in order
until the first time it returns SOME v for some v and then v is the result of the call to first_answer.
+
until the first time it returns <code>SOME v</code> for some v and then v is the result of the call to <code>first_answer</code>.
If the first argument returns NONE for all list elements, then first_answer should raise the exception
+
If the first argument returns <code>NONE</code> for all list elements, then <code>first_answer</code> should raise the exception
NoAnswer. Hints: Sample solution is 5 lines and does nothing fancy.
+
<code>NoAnswer</code>. Hints: Sample solution is 5 lines and does nothing fancy.
  
 
===8) all_answers===
 
===8) all_answers===
Write a function all_answers of type (’a -> ’b list option) -> ’a list -> ’b list option
+
Write a function <code>all_answers</code> of type <code>(’a -> ’b list option) -> ’a list -> ’b list option</code>
 
(notice the 2 arguments are curried). The first argument should be applied to elements of the second
 
(notice the 2 arguments are curried). The first argument should be applied to elements of the second
argument. If it returns NONE for any element, then the result for all_answers is NONE. Else the
+
argument. If it returns <code>NONE</code> for any element, then the result for <code>all_answers</code> is <code>NONE</code>. Else the
calls to the first argument will have produced SOME lst1, SOME lst2, ... SOME lstn and the result of
+
calls to the first argument will have produced <code>SOME lst1, SOME lst2, ... SOME lstn</code> and the result of
all_answers is SOME lst where lst is lst1, lst2, ..., lstn appended together (order doesn’t matter).
+
<code>all_answers</code> is <code>SOME lst</code> where <code>lst</code> is <code>lst1, lst2, ..., lstn</code> appended together (order doesn’t matter).
Hints: The sample solution is 8 lines. It uses a helper function with an accumulator and uses @. Note
+
 
all_answers f [] should evaluate to SOME [].
+
Hints: The sample solution is 8 lines. It uses a helper function with an accumulator and uses [https://smlfamily.github.io/Basis/list.html#SIG:LIST.@:VAL @]. Note
 +
<code>all_answers f []</code> should evaluate to <code>SOME []</code>.
  
==Pattern Matching==
+
==Pattern And Valu Datatypes==
 
The remaining problems use these type definitions, which are inspired by the type definitions an ML implementation would use to implement pattern matching:
 
The remaining problems use these type definitions, which are inspired by the type definitions an ML implementation would use to implement pattern matching:
  
<nowiki>datatype pattern = Wildcard  
+
<syntaxhighlight lang="sml">
                          | Variable of string  
+
datatype pattern = Wildcard  
                          | UnitP  
+
                | Variable of string  
                          | ConstP of int
+
                | UnitP  
                          | TupleP of pattern list  
+
                | ConstP of int
                          | ConstructorP of string * pattern
+
                | TupleP of pattern list  
 +
                | ConstructorP of string * pattern
 
datatype valu = Const of int  
 
datatype valu = Const of int  
 
               | Unit  
 
               | Unit  
 
               | Tuple of valu list  
 
               | Tuple of valu list  
 
               | Constructor of string * valu</nowiki>
 
               | Constructor of string * valu</nowiki>
 +
</syntaxhighlight>
  
 
Given valu v and pattern p, either p matches v or not. If it does, the match produces a list of string * valu pairs; order in the list does not matter. The rules for matching should be unsurprising:
 
Given valu v and pattern p, either p matches v or not. If it does, the match produces a list of string * valu pairs; order in the list does not matter. The rules for matching should be unsurprising:
  
* Wildcard matches everything and produces the empty list of bindings.
+
* <code>Wildcard</code> matches everything and produces the empty list of bindings.
* Variable s matches any value v and produces the one-element list holding (s,v).
+
* <code>Variable s</code> matches any value v and produces the one-element list holding (s,v).
* UnitP matches only Unit and produces the empty list of bindings.
+
* <code>UnitP</code> matches only <code>Unit</code> and produces the empty list of bindings.
* ConstP 17 matches only Const 17 and produces the empty list of bindings (and similarly for other integers).
+
* <code>ConstP 17</code> matches only <code>Const 17</code> and produces the empty list of bindings (and similarly for other integers).
* TupleP ps matches a value of the form Tuple vs if ps and vs have the same length and for all i, the ith element of ps matches the ith element of vs. The list of bindings produced is all the lists from the nested pattern matches appended together.
+
* <code>TupleP ps</code> matches a value of the form <code>Tuple vs</code> if ps and vs have the same length and for all i, the ith element of ps matches the ith element of vs. The list of bindings produced is all the lists from the nested pattern matches appended together.
* ConstructorP(s1,p) matches Constructor(s2,v) if s1 and s2 are the same string (you can compare them with =) and p matches v. The list of bindings produced is the list from the nested pattern match. We call the strings s1 and s2 the constructor name.
+
* <code>ConstructorP(s1,p)</code> matches <code>Constructor(s2,v)</code> if s1 and s2 are the same string (you can compare them with =) and p matches v. The list of bindings produced is the list from the nested pattern match. We call the strings s1 and s2 the constructor name.
 
* Nothing else matches.
 
* Nothing else matches.
  
===9) ===
+
==Pattern Matching Rosetta Stone==
 +
<youtube>ZR5oyolk3I8</youtube>
 +
 
 +
<syntaxhighlight lang="sml">
 +
datatype pattern = Wildcard
 +
| Variable of string
 +
| UnitP
 +
| ConstP of int
 +
| TupleP of pattern list
 +
| ConstructorP of string * pattern
 +
type point = (int * int)
 +
type circle = (point * int)
 +
fun is_overlapping_with_origin(circle_option : circle option) : bool =
 +
    case circle_option of
 +
      NONE              (* pattern_a *) => false
 +
    | SOME((x,y),radius) (* pattern_b *) => x*x + y*y <= radius*radius
 +
val false_none = is_overlapping_with_origin(NONE)
 +
val true_r6    = is_overlapping_with_origin(SOME((3,4),6))
 +
val true_r5    = is_overlapping_with_origin(SOME((3,4),5))
 +
val false_r4  = is_overlapping_with_origin(SOME((3,4),4))
 +
val pattern_a = ConstructorP("NONE", UnitP)
 +
val pattern_b = ConstructorP("SOME",
 +
                    TupleP([
 +
                        TupleP([Variable("x"), Variable("y")]),
 +
                        Variable("radius")
 +
                    ])
 +
                )
 +
fun is_radius_42(circle_option : circle option) : bool =
 +
    case circle_option of
 +
      SOME(_,42) (* pattern_c *) => true
 +
    | _          (* pattern_d *) => false
 +
val b =  is_radius_42(SOME((1,2),42))
 +
val pattern_c = ConstructorP("SOME",
 +
                    TupleP([
 +
                        Wildcard,
 +
                        ConstP(42)
 +
                    ])
 +
                )
 +
val pattern_d = Wildcard
 +
</syntaxhighlight>
  
===10) ===
+
==Pattern Matching Functions==
 +
===9) count_wildcards, count_wild_and_variable_lengths, and count_some_var===
 +
(This problem uses the pattern datatype but is not really about pattern-matching.) A function <code>g</code> has been provided to you.
 +
====9a) count_wildcards====
 +
Use <code>g</code> to define a function <code>count_wildcards</code> that takes a pattern and returns how many <code>Wildcard</code>
 +
patterns it contains.
 +
====9b) count_wild_and_variable_lengths====
 +
Use <code>g</code> to define a function <code>count_wild_and_variable_lengths</code> that takes a pattern and returns
 +
the number of <code>Wildcard</code> patterns it contains plus the sum of the string lengths of all the variables
 +
in the variable patterns it contains. (Use [https://smlfamily.github.io/Basis/string.html#SIG:STRING.size:VAL String.size]. We care only about variable names; the
 +
constructor names are not relevant.)
 +
====9c) count_some_var====
 +
Use <code>g</code> to define a function <code>count_some_var</code> that takes a string and a pattern (as a pair) and
 +
returns the number of times the string appears as a variable in the pattern. We care only about
 +
variable names; the constructor names are not relevant.
  
===11) ===
+
===10) check_pat===
 +
Write a function <code>check_pat</code> that takes a pattern and returns true if and only if all the variables
 +
appearing in the pattern are distinct from each other (i.e., use different strings). The constructor
 +
names are not relevant. Hints: The sample solution uses two helper functions. The first takes a
 +
pattern and returns a list of all the strings it uses for variables. Using [https://smlfamily.github.io/Basis/list.html#SIG:LIST.foldl:VAL foldl] with a function that uses
 +
[https://smlfamily.github.io/Basis/list.html#SIG:LIST.@:VAL @] is useful in one case. The second takes a list of strings and decides if it has repeats. [https://smlfamily.github.io/Basis/list.html#SIG:LIST.exists:VAL List.exists] may
 +
be useful. Sample solution is 15 lines. These are hints: We are not requiring [https://smlfamily.github.io/Basis/list.html#SIG:LIST.foldl:VAL foldl] and [https://smlfamily.github.io/Basis/list.html#SIG:LIST.exists:VAL List.exists]
 +
here, but they make it easier.
  
===12) ===
+
===11) match===
 +
Write a function <code>match</code> that takes a <code>valu * pattern</code> and returns a <code>(string * valu) list option</code>,
 +
namely <code>NONE</code> if the pattern does not match and <code>SOME lst</code> where lst is the list of bindings if it does.
 +
Note that if the value matches but the pattern has no patterns of the form <code>Variable s</code>, then the result
 +
is <code>SOME []</code>. Hints: Sample solution has one case expression with 7 branches. The branch for tuples
 +
uses <code>all_answers</code> and [https://smlfamily.github.io/Basis/list-pair.html#SIG:LIST_PAIR.zip:VAL ListPair.zip]. Sample solution is 13 lines. Remember to look above for the
 +
rules for what patterns match what values, and what bindings they produce. These are hints: We are
 +
not requiring <code>all_answers</code> and [https://smlfamily.github.io/Basis/list-pair.html#SIG:LIST_PAIR.zip:VAL ListPair.zip] here, but they make it easier.
 +
 
 +
===12) first_match===
 +
Write a function <code>first_match</code> that takes a value and a list of patterns and returns a
 +
<code>(string * valu) list option</code>, namely <code>NONE</code> if no pattern in the list matches or <code>SOME lst</code> where
 +
lst is the list of bindings for the first pattern in the list that matches. Use <code>first_answer</code> and a
 +
handle-expression. Hints: Sample solution is 3 lines.
  
 
==Challenge Problem==
 
==Challenge Problem==
  
 
===13) typecheck_patterns===
 
===13) typecheck_patterns===
 +
Write a function <code>typecheck_patterns</code> that “type-checks” a pattern list. Types
 +
for our made-up pattern language are defined by:
 +
 +
<syntaxhighlight lang="sml">
 +
datatype typ = Anything (* any type of value is okay *)
 +
            | UnitT (* type for Unit *)
 +
            | IntT (* type for integers *)
 +
            | TupleT of typ list (* tuple types *)
 +
            | Datatype of string (* some named datatype *)</nowiki>
 +
</syntaxhighlight>
 +
 +
<code>typecheck_patterns</code> should have type <code>((string * string * typ) list) * (pattern list) -> typ option</code>.
 +
The first argument contains elements that look like ("foo","bar",IntT), which means constructor foo
 +
makes a value of type Datatype "bar" given a value of type IntT. Assume list elements all have different
 +
first fields (the constructor name), but there are probably elements with the same second field (the datatype
 +
name). Under the assumptions this list provides, you “type-check” the pattern list to see if there exists
 +
some typ (call it t) that all the patterns in the list can have. If so, return <code>SOME t</code>, else return <code>NONE</code>.
 +
 +
You must return the “most lenient” type that all the patterns can have. For example, given patterns
 +
<code>TupleP[Variable("x"),Variable("y")]</code> and <code>TupleP[Wildcard,Wildcard]</code>, return <code>TupleT[Anything,Anything]</code>
 +
even though they could both have type <code>TupleT[IntT,IntT]</code>. As another example, if the only patterns
 +
are <code>TupleP[Wildcard,Wildcard]</code> and <code>TupleP[Wildcard,TupleP[Wildcard,Wildcard]]</code>, you must return
 +
<code>TupleT[Anything,TupleT[Anything,Anything]]</code>.
  
 
=Type Summary=
 
=Type Summary=
<nowiki>val g = fn : (unit -> int) -> (string -> int) -> pattern -> int
+
<syntaxhighlight lang="sml">
 +
val g = fn : (unit -> int) -> (string -> int) -> pattern -> int
 
val only_capitals = fn : string list -> string list
 
val only_capitals = fn : string list -> string list
 
val longest_string1 = fn : string list -> string
 
val longest_string1 = fn : string list -> string
Line 119: Line 221:
 
val check_pat = fn : pattern -> bool
 
val check_pat = fn : pattern -> bool
 
val match = fn : valu * pattern -> (string * valu) list option
 
val match = fn : valu * pattern -> (string * valu) list option
val first_match = fn : valu -> pattern list -> (string * valu) list option</nowiki>
+
val first_match = fn : valu -> pattern list -> (string * valu) list option
 
val typecheck_patterns = fn : (string * string * typ) list * pattern list -> typ option
 
val typecheck_patterns = fn : (string * string * typ) list * pattern list -> typ option
 +
</syntaxhighlight>
  
 
=Test=
 
=Test=
{{SMLUnitTest|hw3|hw3}}
+
{{SMLUnitTesting|run_uw3_testing|uw3}}
  
Test your functions.  The unit tests you have been provided are intentionally minimal.
+
Test your functions.  The unit tests you have been provided by UW are intentionally minimal.
  
 
=Pledge, Acknowledgments, Citations=
 
=Pledge, Acknowledgments, Citations=
{{Pledge|lab-3-card-game-date}}
+
{{Pledge|uw3-pattern-matching}}

Latest revision as of 19:25, 10 July 2023

Hints and Gotchas

Hints and Gotchas

Credit

All credit for this assignment goes to Prof. Grossman and his team at UW. UW or Coursera

Code To Implement

file: src/main/sml/uw3/uw3.sml Smlnj-logo.png
functions: only_capitals
longest_string1
longest_string_helper
longest_string3
longest_string4
longest_capitalized
rev_string
first_answer
all_answers
count_wildcards
count_wild_and_variable_lengths
count_some_var
check_pat
match
first_match
typecheck_patterns

You will define several SML functions. Many will be very short because they will use other higher-order functions. You may use functions in ML’s library; the problems point you toward the useful functions and often require that you use them. The sample solution is about 120 lines, including the provided code, but not including the challenge problem. This assignment is probably more difficult than Homework 2 even though (perhaps because) many of the problems have 1-line answers.

String Functions

1) only_capitals

Write a function only_capitals that takes a string list and returns a string list that has only the strings in the argument that start with an uppercase letter. Assume all strings have at least 1 character. Use List.filter, Char.isUpper, and String.sub to make a 1-2 line solution.

2) longest_string1

Write a function longest_string1 that takes a string list and returns the longest string in the list. If the list is empty, return "". In the case of a tie, return the string closest to the beginning of the list. Use foldl, String.size, and no recursion (other than the implementation of foldl is recursive).

3) longest_string2

Write a function longest_string2 that is exactly like longest_string1 except in the case of ties it returns the string closest to the end of the list. Your solution should be almost an exact copy of longest_string1. Still use foldl and String.size.

4) longest_string_helper, longest_string3, and longest_string4

Write functions longest_string_helper, longest_string3, and longest_string4 such that:

  • longest_string3 has the same behavior as longest_string1 and longest_string4 has the same behavior as longest_string2.
  • longest_string_helper has type (int * int -> bool) -> string list -> string (notice the currying). This function will look a lot like longest_string1 and longest_string2 but is more general because it takes a function as an argument.
  • If longest_string_helper is passed a function that behaves like > (so it returns true exactly when its first argument is strictly greater than its second), then the function returned has the same behavior as longest_string1.
  • longest_string3 and longest_string4 are defined with val-bindings and partial applications of longest_string_helper.

5) longest_capitalized

Write a function longest_capitalized that takes a string list and returns the longest string in the list that begins with an uppercase letter, or "" if there are no such strings. Assume all strings have at least 1 character. Use a val-binding and the ML library’s o operator for composing functions. Resolve ties like in problem 2.

6) rev_string

Write a function rev_string that takes a string and returns the string that is the same characters in reverse order. Use ML’s o operator, the library function rev for reversing lists, and two library functions in the String module. (Browse the module documentation to find the most useful functions.)

Utilities For Pattern Matching

The next two problems involve writing functions over lists that will be useful in later problems.

7) first_answer

Write a function first_answer of type (’a -> ’b option) -> ’a list -> ’b (notice the 2 arguments are curried). The first argument should be applied to elements of the second argument in order until the first time it returns SOME v for some v and then v is the result of the call to first_answer. If the first argument returns NONE for all list elements, then first_answer should raise the exception NoAnswer. Hints: Sample solution is 5 lines and does nothing fancy.

8) all_answers

Write a function all_answers of type (’a -> ’b list option) -> ’a list -> ’b list option (notice the 2 arguments are curried). The first argument should be applied to elements of the second argument. If it returns NONE for any element, then the result for all_answers is NONE. Else the calls to the first argument will have produced SOME lst1, SOME lst2, ... SOME lstn and the result of all_answers is SOME lst where lst is lst1, lst2, ..., lstn appended together (order doesn’t matter).

Hints: The sample solution is 8 lines. It uses a helper function with an accumulator and uses @. Note all_answers f [] should evaluate to SOME [].

Pattern And Valu Datatypes

The remaining problems use these type definitions, which are inspired by the type definitions an ML implementation would use to implement pattern matching:

datatype pattern = Wildcard 
                 | Variable of string 
                 | UnitP 
                 | ConstP of int
                 | TupleP of pattern list 
                 | ConstructorP of string * pattern
datatype valu = Const of int 
              | Unit 
              | Tuple of valu list 
              | Constructor of string * valu</nowiki>

Given valu v and pattern p, either p matches v or not. If it does, the match produces a list of string * valu pairs; order in the list does not matter. The rules for matching should be unsurprising:

  • Wildcard matches everything and produces the empty list of bindings.
  • Variable s matches any value v and produces the one-element list holding (s,v).
  • UnitP matches only Unit and produces the empty list of bindings.
  • ConstP 17 matches only Const 17 and produces the empty list of bindings (and similarly for other integers).
  • TupleP ps matches a value of the form Tuple vs if ps and vs have the same length and for all i, the ith element of ps matches the ith element of vs. The list of bindings produced is all the lists from the nested pattern matches appended together.
  • ConstructorP(s1,p) matches Constructor(s2,v) if s1 and s2 are the same string (you can compare them with =) and p matches v. The list of bindings produced is the list from the nested pattern match. We call the strings s1 and s2 the constructor name.
  • Nothing else matches.

Pattern Matching Rosetta Stone

datatype pattern = Wildcard
		 | Variable of string
		 | UnitP
		 | ConstP of int
		 | TupleP of pattern list
		 | ConstructorP of string * pattern
type point = (int * int)
type circle = (point * int)
fun is_overlapping_with_origin(circle_option : circle option) : bool = 
    case circle_option of
      NONE               (* pattern_a *) => false
    | SOME((x,y),radius) (* pattern_b *) => x*x + y*y <= radius*radius
val false_none = is_overlapping_with_origin(NONE)
val true_r6    = is_overlapping_with_origin(SOME((3,4),6))
val true_r5    = is_overlapping_with_origin(SOME((3,4),5))
val false_r4   = is_overlapping_with_origin(SOME((3,4),4))
val pattern_a = ConstructorP("NONE", UnitP)
val pattern_b = ConstructorP("SOME", 
                    TupleP([
                        TupleP([Variable("x"), Variable("y")]),
                        Variable("radius")
                    ])
                )
fun is_radius_42(circle_option : circle option) : bool = 
    case circle_option of
      SOME(_,42) (* pattern_c *) => true
    | _          (* pattern_d *) => false
val b =  is_radius_42(SOME((1,2),42))
val pattern_c = ConstructorP("SOME", 
                    TupleP([
                        Wildcard,
                        ConstP(42)
                    ])
                )
val pattern_d = Wildcard

Pattern Matching Functions

9) count_wildcards, count_wild_and_variable_lengths, and count_some_var

(This problem uses the pattern datatype but is not really about pattern-matching.) A function g has been provided to you.

9a) count_wildcards

Use g to define a function count_wildcards that takes a pattern and returns how many Wildcard patterns it contains.

9b) count_wild_and_variable_lengths

Use g to define a function count_wild_and_variable_lengths that takes a pattern and returns the number of Wildcard patterns it contains plus the sum of the string lengths of all the variables in the variable patterns it contains. (Use String.size. We care only about variable names; the constructor names are not relevant.)

9c) count_some_var

Use g to define a function count_some_var that takes a string and a pattern (as a pair) and returns the number of times the string appears as a variable in the pattern. We care only about variable names; the constructor names are not relevant.

10) check_pat

Write a function check_pat that takes a pattern and returns true if and only if all the variables appearing in the pattern are distinct from each other (i.e., use different strings). The constructor names are not relevant. Hints: The sample solution uses two helper functions. The first takes a pattern and returns a list of all the strings it uses for variables. Using foldl with a function that uses @ is useful in one case. The second takes a list of strings and decides if it has repeats. List.exists may be useful. Sample solution is 15 lines. These are hints: We are not requiring foldl and List.exists here, but they make it easier.

11) match

Write a function match that takes a valu * pattern and returns a (string * valu) list option, namely NONE if the pattern does not match and SOME lst where lst is the list of bindings if it does. Note that if the value matches but the pattern has no patterns of the form Variable s, then the result is SOME []. Hints: Sample solution has one case expression with 7 branches. The branch for tuples uses all_answers and ListPair.zip. Sample solution is 13 lines. Remember to look above for the rules for what patterns match what values, and what bindings they produce. These are hints: We are not requiring all_answers and ListPair.zip here, but they make it easier.

12) first_match

Write a function first_match that takes a value and a list of patterns and returns a (string * valu) list option, namely NONE if no pattern in the list matches or SOME lst where lst is the list of bindings for the first pattern in the list that matches. Use first_answer and a handle-expression. Hints: Sample solution is 3 lines.

Challenge Problem

13) typecheck_patterns

Write a function typecheck_patterns that “type-checks” a pattern list. Types for our made-up pattern language are defined by:

datatype typ = Anything (* any type of value is okay *)
             | UnitT (* type for Unit *)
             | IntT (* type for integers *)
             | TupleT of typ list (* tuple types *)
             | Datatype of string (* some named datatype *)</nowiki>

typecheck_patterns should have type ((string * string * typ) list) * (pattern list) -> typ option. The first argument contains elements that look like ("foo","bar",IntT), which means constructor foo makes a value of type Datatype "bar" given a value of type IntT. Assume list elements all have different first fields (the constructor name), but there are probably elements with the same second field (the datatype name). Under the assumptions this list provides, you “type-check” the pattern list to see if there exists some typ (call it t) that all the patterns in the list can have. If so, return SOME t, else return NONE.

You must return the “most lenient” type that all the patterns can have. For example, given patterns TupleP[Variable("x"),Variable("y")] and TupleP[Wildcard,Wildcard], return TupleT[Anything,Anything] even though they could both have type TupleT[IntT,IntT]. As another example, if the only patterns are TupleP[Wildcard,Wildcard] and TupleP[Wildcard,TupleP[Wildcard,Wildcard]], you must return TupleT[Anything,TupleT[Anything,Anything]].

Type Summary

val g = fn : (unit -> int) -> (string -> int) -> pattern -> int
val only_capitals = fn : string list -> string list
val longest_string1 = fn : string list -> string
val longest_string2 = fn : string list -> string
val longest_string_helper = fn : (int * int -> bool) -> string list -> string
val longest_string3 = fn : string list -> string
val longest_string4 = fn : string list -> string
val longest_capitalized = fn : string list -> string
val rev_string = fn : string -> string
val first_answer = fn : (’a -> ’b option) -> ’a list -> ’b
val all_answers = fn : (’a -> ’b list option) -> ’a list -> ’b list option
val count_wildcards = fn : pattern -> int
val count_wild_and_variable_lengths = fn : pattern -> int
val count_some_var = fn : string * pattern -> int
val check_pat = fn : pattern -> bool
val match = fn : valu * pattern -> (string * valu) list option
val first_match = fn : valu -> pattern list -> (string * valu) list option
val typecheck_patterns = fn : (string * string * typ) list * pattern list -> typ option

Test

source folder: src/test/sml/uw3
how to run with CM.make verbosity off: sml -Ccm.verbose=false run_uw3_testing.sml
how to run with CM.make verbosity on: sml run_uw3_testing.sml

note: ensure that you have removed all printing to receive credit for any assignment.

SML Error Messages

Test your functions. The unit tests you have been provided by UW are intentionally minimal.

Pledge, Acknowledgments, Citations

file: uw3-pattern-matching-pledge-acknowledgments-citations.txt

More info about the Honor Pledge